University student invents algorithm that speeds up internet searches

The algorithm was invented at the IT University in Copenhagen and facilitates internet searches in large databases

A Vietnamese computer scientist working at the IT University in Copenhagen has invented an algorithm that significantly speeds up internet searches in large databases.

Ninh Pham invented the algorithm, dubbed 'Odd Sketch', as part of his postdoctoral thesis.

As more and more data is uploaded to the internet, search engines must work harder to process and compare the information. Computers are also getting faster, so new algorithms are required if these searches are to remain efficient and make full use of the available processing power.

Addressing the problem of similarity searches
Odd Sketch is designed for 'similarity searches', in which a search engine compares a user's query with the contents of a large database.

"Similarity search is a core problem for computer scientists," commented Pham.

"If we can compare two pieces of data quicker, then time and money can be saved."

The algorithm created a lot of buzz earlier this year when it was presented in a paper co-authored by Pham, which subsequently won the 'best paper award' at the WWW conference in Seoul.

"Our algorithm is the fastest at comparing two documents, if the documents are of a similar nature. It also takes up much less space than existing ones," explained Pham.

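The article does not describe how Odd Sketch works internally, so the following is only a simplified illustration of the general idea the name suggests: summarise each document as a small array of parity bits, then compare two documents by comparing their summaries. The function names, the 1,024-bit sketch size and the use of SHA-256 are assumptions made for this example; it is not the published algorithm.

```python
import hashlib
import math


def bucket(item: str, n_bits: int) -> int:
    """Map an item to one of n_bits positions using a stable hash."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_bits


def parity_sketch(items, n_bits: int = 1024) -> int:
    """Return an n_bits-wide integer; bit i is 1 if an odd number of
    distinct items hash to position i (illustrative, hypothetical helper)."""
    sketch = 0
    for item in set(items):
        sketch ^= 1 << bucket(item, n_bits)  # flip the parity bit
    return sketch


def estimate_difference(sketch_a: int, sketch_b: int, n_bits: int = 1024) -> float:
    """Estimate how many items appear in exactly one of the two sets.

    Items shared by both sets cancel out under XOR, so the differing bits
    depend only on the items that are not shared.
    """
    differing = bin(sketch_a ^ sketch_b).count("1")
    if 2 * differing >= n_bits:
        return math.inf  # sketch is saturated; the sets are too different
    # Inverts the expected number of odd bits when m items fall into n_bits bins.
    return -(n_bits / 2) * math.log(1 - 2 * differing / n_bits)


doc_a = {"fast", "search", "large", "database", "query"}
doc_b = doc_a | {"index", "sketch", "parity"}
print(estimate_difference(parity_sketch(doc_a), parity_sketch(doc_b)))
```

Each summary here occupies 1,024 bits regardless of document length, and comparing two summaries is a single XOR followed by a bit count, which is in the spirit of the speed and space savings Pham describes for similar documents.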

Google to benefit
An organisation that could benefit greatly from Odd Sketch is Google.

"The huge amounts of data that are constantly being uploaded mean Google has to index billions of websites and respond to billions of search queries. The question is how do you manage to answer each and every one?" concluded Pham.

The answer is that you don't, as it is effectively impossible for Google to do so. That is why new algorithms like Odd Sketch are needed: they are designed to make the search process more efficient by searching only within a specific section of the enormous sea of data out there.