c# - Address Match Key Algorithm -


I have a list of addresses in two different tables that may be slightly off from each other, and I need to be able to match them. For example, the same address can be entered in several ways:

  • 110 Test St
  • 110 Test St.
  • 110 Test Street

Though simple, you can imagine the situation in more complex scenarios. I am trying to develop a simple algorithm that will be able to match the above addresses as a key.

For example, the key could be "11TEST": the first two characters of 110, the first two of Test and the first two of the street variation. A full match key would also include the first 5 digits of the zip code, so for the above example the full key might look like "11TEST44680".
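The key scheme described above can be sketched roughly as follows. This is only a minimal illustration in Python (the question accepts any language); the function name `make_match_key` and the rule of simply taking the first three whitespace-separated tokens are assumptions, not part of the original post.

```python
import re


def make_match_key(street_line: str, zipcode: str = "") -> str:
    """Build a rough match key such as '11TEST44680' (hypothetical helper)."""
    # Uppercase and drop punctuation so that "St." and "St" compare equal.
    cleaned = re.sub(r"[^A-Z0-9 ]", "", street_line.upper())
    tokens = cleaned.split()
    # First two characters of each of the first three tokens:
    # house number, street name, street type.
    key = "".join(token[:2] for token in tokens[:3])
    # Append the first five digits of the zip code when one is supplied.
    digits = re.sub(r"\D", "", zipcode)
    return key + digits[:5]


print(make_match_key("110 Test Street", "44680"))
```

Note that this only collapses variants that share their first two letters: "Road" and "Rd" would still produce different keys, so a suffix lookup table would probably be needed on top of it.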

I am looking for ideas for an effective algorithm, or resources I can look at. I would like to consider ideas at this stage of development. Any idea can be in pseudo code or your language of choice.

We are only concerned with US addresses. In fact, we are only looking at addresses from 250 zip codes in Ohio and Michigan. We do not have access to any postal software, although we would be open to ideas for a cost-effective solution (it would essentially be used once). Please keep in mind that this is for an initial dump of data from an official source, so suggestions for how users can clean it up are helpful as I build out the application, but I would like the best possible starting point by matching addresses as well as I can up front.

I am working on a similar algorithm as we speak; it has to handle addresses from several countries, including the USA, Mexico and the UK. The problem I am facing is that they are stored in our database as 3 plain-text fields (whoever thought that was a good idea should be shot, IMHO), so trying to handle rural routes, general delivery, large-volume receivers, multiple countries, province versus state versus county, postal code versus zip code, and spelling mistakes is no small or simple task.

Spelling mistakes alone were no small thing, especially once you get to places that use French names. Saint, Sainte, St, Ste, Saints, Saintes, Sts, Stes, Grand, Grande, Grands and Grandes, with or without a period and with or without hyphenation, cause no end of performance issues, especially when "St" can mean either Saint or Street and may or may not have been entered in the correct form (i.e. feminine versus masculine). And what if the address is entered correctly for the most part but has the wrong province or postal code?
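As a rough illustration of the Saint/Street problem, the sketch below expands the abbreviated forms to one spelling per variant. The table `SAINT_VARIANTS`, the function `normalize_tokens` and the positional rule (a trailing "St" is read as Street, any other "St" as Saint) are all assumptions made for the example; a real data set will contain cases this heuristic gets wrong.

```python
import re

# Hypothetical lookup table: abbreviated form -> full spelling.
SAINT_VARIANTS = {"ST": "SAINT", "STE": "SAINTE", "STS": "SAINTS", "STES": "SAINTES"}


def normalize_tokens(name: str) -> list[str]:
    """Split a street name into uppercase tokens with abbreviations expanded."""
    # Treat periods and hyphens as separators so "St-Jean" and "St. Jean" agree.
    tokens = re.sub(r"[.\-]", " ", name.upper()).split()
    result = []
    for index, token in enumerate(tokens):
        is_last = index == len(tokens) - 1
        if token == "ST" and is_last and len(tokens) > 1:
            # Assumed heuristic: a trailing "St" is the street type.
            result.append("STREET")
        else:
            # Masculine and feminine forms are kept distinct on purpose.
            result.append(SAINT_VARIANTS.get(token, token))
    return result


print(normalize_tokens("St-Jean St."))
```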

There is one place to start your search that I have found really useful for eliminating a large share of spelling mistakes. After that, it is mainly a matter of searching for keywords and comparing them against the postal database.
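One common way to absorb spelling mistakes before comparing against a reference list is edit distance (Levenshtein distance). Using it here is an assumption for illustration, not necessarily the technique this answer has in mind, and `closest_name`, `known_names` and the `max_distance` threshold are hypothetical names standing in for a lookup against a postal database.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def closest_name(name, known_names, max_distance=2):
    """Return the known name nearest to `name`, or None if none is close enough."""
    target = name.upper()
    best = min(known_names, key=lambda k: edit_distance(target, k.upper()), default=None)
    if best is not None and edit_distance(target, best.upper()) <= max_distance:
        return best
    return None


print(closest_name("Mian", ["Main", "Maple", "Market"]))
```

The threshold matters: set it too high and short street names start matching each other, so it is probably worth scaling it with the length of the name.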

I would be really interested in cooperating with anyone who is currently developing a tool to do this; maybe we can help each other toward a common solution. I already have part of it done and have gotten past all the issues I have mentioned so far, and having others working on the same problem would help with bouncing ideas around.

Cheers - [on Ben Apps Dot Ca

