So, let's start right away! I can warn you, it's not the easiest of subjects, but it's what I did today.
I've built an efficient algorithm for solving the Edit Distance problem.
Basically the Edit Distance represents the "distance" between two words. Most of the time it's the number of operations required to get from one word to the other.
It seems like a simple problem, but the difficulty rests in finding an optimal solution. In this case, the minimal Edit Distance.
When you've got some experience designing algorithms for complicated problems you might already see that this problem will have a high complexity.
But, the complexity can be heavily reduced by using a technique called Dynamic Programming.
In essence it's a technique which stores results of subproblems to later combine these and form the final result for the complete problem.
This requires the problem to have a so called Optimal Substructure. In short this means the main problem can be divided in smaller, overlapping subproblems, which can be combined to calculate the main problems solution.
This is done by recursively solving all the subproblems, and storing their results in for example an array.
So you've got to build a recurrent algorithm for solving the subproblems.
In the case of the Edit Distance problem we can say that the distance between words w1 and w2 can be expressed as the distance between w1' and w2 + the costs of doing the operation w1 -> w1'.
Common operations are:
- Copy, which can only be applied when the characters at a given position in w1 and w2 are the same.
- Replace, which replaces the character in w1 by the character in w2.
- Delete, which deletes a character
- Insert, which inserts a character
When the Edit Distances of all the possible stages in the algorithm are calculated, we can easily walk through our array of results and find the shortest Edit Distance, and the list of operations it's using.
Interesting stuff eh?
Be sure to Google it for more specific information and examples.
Published with Blogger-droid v1.6.3
No comments:
Post a Comment