Daniel Flower Dot Com Banner

Case Representation

The cases are stored in an SQL Server database. A row in the case table includes a case ID, numeric values for price, duration, and number of people, and the IDs of symbols for the other dimensions.

There is a Symbol table which holds all the symbols. Each row includes the symbol's name and the ID of its symbol type, such as the holiday type, transportation type, etc.
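
As a rough illustration of the layout described above, the two kinds of rows could be modelled like this in Python. This is only a sketch: the field names and the IDs are assumptions made for the example, since the text only says that a case has an ID, three numeric values, and symbol IDs for the other dimensions, and that a symbol has a name and a symbol type ID.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Symbol:
    symbol_id: int
    name: str            # e.g. "Car"
    symbol_type_id: int  # the dimension it belongs to, e.g. transportation


@dataclass
class Case:
    case_id: int
    price: float
    duration: int
    number_of_people: int
    # Maps a symbol type ID (a dimension) to the ID of the symbol stored for it.
    symbol_ids: Dict[int, int] = field(default_factory=dict)


# Hypothetical IDs and values, purely for illustration.
TRANSPORTATION = 1
car = Symbol(symbol_id=10, name="Car", symbol_type_id=TRANSPORTATION)
example_case = Case(
    case_id=1,
    price=400.0,
    duration=7,
    number_of_people=2,
    symbol_ids={TRANSPORTATION: car.symbol_id},
)
```

Keeping the symbolic dimensions as IDs rather than names mirrors the table design, where the similarity between two symbols is looked up by ID.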

The similarity between symbols is defined manually and stored in the symbol similarity table. Each row simply holds the IDs of two symbols and a similarity between zero and one. Because each ordered pair of symbols has its own row, similarities can be asymmetric: for example, changing from a train to a car is a bigger difference than changing from a car to a train. The values were set in a web page which displayed the symbols in a matrix. Below is an example showing how the transportation dimension was set up:

Edit Symbol Similarities

[Matrix of transportation symbols: Car, Coach, Plane, Train]

This was a simple way to set the similarities. For the region dimension, however, it was not so easy, for two reasons: firstly, I do not know the difference between, say, Slowakei and Chalkidiki; and secondly, there were 59 regions and hence 59 × 59 = 3,481 values to fill in. Needless to say, I did not do this manually.

Instead, I simply set each similarity on the diagonal of the matrix to 1.0 (so every region was exactly the same as itself) and to 0.5 everywhere else. I decided this was satisfactory because the similarity between different regions would show itself anyway through the other dimensions, such as holiday type and price. For example, two places with a similar holiday type should be similar to each other, and two places with similar prices will often be similar as well.
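
The default fill can be sketched in a few lines of Python. This is an illustration under assumptions rather than the code that was actually used: the function names, the placeholder region IDs, and the 0.4 / 0.6 transportation values are all made up for the example. The table is keyed by the ordered pair (from, to), which is what lets a pair of symbols have different similarities in each direction.

```python
from typing import Dict, List, Tuple

# Keyed by the ordered pair (from_symbol_id, to_symbol_id).
SimilarityTable = Dict[Tuple[int, int], float]


def default_similarities(symbol_ids: List[int],
                         off_diagonal: float = 0.5) -> SimilarityTable:
    """Fill the whole matrix: 1.0 on the diagonal, a constant everywhere else."""
    return {
        (a, b): (1.0 if a == b else off_diagonal)
        for a in symbol_ids
        for b in symbol_ids
    }


def lookup(table: SimilarityTable, from_id: int, to_id: int) -> float:
    """Return the stored similarity for changing from one symbol to another."""
    return table[(from_id, to_id)]


# Placeholder IDs standing in for the 59 regions.
region_ids = list(range(1, 60))
region_table = default_similarities(region_ids)
print(len(region_table))  # 59 * 59 entries

# Hypothetical asymmetric entries for two transportation symbols.
CAR, TRAIN = 100, 101
transport_table = default_similarities([CAR, TRAIN])
transport_table[(TRAIN, CAR)] = 0.4  # train -> car: the bigger change
transport_table[(CAR, TRAIN)] = 0.6  # car -> train: the smaller change
```

With this layout, generating the 3,481 region values is a single call, while dimensions with only a handful of symbols can still be overridden pair by pair.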

To calculate the similarity between two values of a numerical dimension, I used the following formula:

Similarity = (MaximumPossibleValue - |NumberOne - NumberTwo|) / MaximumPossibleValue

For example, if the maximum possible price was $1000, the requested amount was $500, and we are comparing to a case with a price of $400, then the similarity is (1000 - 100) / 1000 = 0.9, or 90%.
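
The formula translates directly into code. The following Python sketch is one possible rendering; the function and parameter names are assumptions, and it presumes both values lie between zero and the maximum, since otherwise the result could fall below zero.

```python
def numeric_similarity(number_one: float, number_two: float,
                       maximum_possible_value: float) -> float:
    """Similarity = (max - |a - b|) / max, assuming 0 <= a, b <= max."""
    if maximum_possible_value <= 0:
        raise ValueError("maximum_possible_value must be positive")
    difference = abs(number_one - number_two)
    return (maximum_possible_value - difference) / maximum_possible_value


# The worked example from the text: requested $500, case priced at $400.
price_similarity = numeric_similarity(500, 400, 1000)
print(f"{price_similarity:.0%}")  # 90%
```

Because the absolute difference is used, the result is the same whichever of the two values is the request and whichever is the stored case.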

With all the data in place, the next step was to implement the actual retrieval. See the next page for the implementation details.