Lucene is a program library published by the Apache Software Foundation. It is open source and free for everyone to use and modify. Lucene was originally written entirely in Java, but ports to other programming languages now exist. Apache Solr and Elasticsearch are powerful extensions that build on Lucene and give the search function even more possibilities.

Lucene performs full-text search. Put simply, a program searches a series of text documents for one or more terms that the user has specified. This shows that Lucene is not used solely on the World Wide Web, even though that is where most searches take place. Lucene can also be used for archives, libraries, or even on a home desktop PC. It searches not only HTML documents but also e-mail and PDF files.

The index, the heart of Lucene, is decisive for the search, since all terms of all documents are stored here. In principle, an inverted index is simply a table: for each term, the positions where it occurs are stored. To build an index, the terms first have to be extracted: all terms must be taken from all the documents and stored in the index. Lucene lets users configure this extraction individually. During configuration, developers decide which fields they want to include in the index.

To understand this, you have to go back one step. The objects that Lucene works with are documents of every kind. From Lucene's point of view, however, the documents themselves contain fields. These fields hold, for example, the name of the author, the title of the document, or the file name.
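The idea of an inverted index built from selected document fields can be illustrated with a small sketch. This is not Lucene's API and not how Lucene stores its index internally; it is a simplified plain-Python model, and the sample documents as well as the names `tokenize`, `build_inverted_index`, and `search` are hypothetical, chosen only for this illustration.

```python
import re
from collections import defaultdict

# Hypothetical sample documents; each document is a set of named fields.
DOCUMENTS = [
    {"author": "Ada", "title": "Search basics", "body": "An index stores terms"},
    {"author": "Grace", "title": "Index design", "body": "Terms point to documents"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents, indexed_fields):
    """Map each term to the (doc_id, field, position) places where it occurs."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(documents):
        # Only the fields chosen during "configuration" are indexed.
        for field in indexed_fields:
            for position, term in enumerate(tokenize(doc.get(field, ""))):
                index[term].append((doc_id, field, position))
    return index


def search(index, term):
    """Return the sorted ids of documents that contain the term."""
    return sorted({doc_id for doc_id, _, _ in index.get(term.lower(), [])})


# Index the title and body fields, but deliberately leave out the author field.
index = build_inverted_index(DOCUMENTS, indexed_fields=["title", "body"])

print(search(index, "index"))  # prints [0, 1]
print(search(index, "ada"))    # prints [] because the author field is not indexed
```

Because the lookup goes from term to documents rather than scanning every document, a query only touches the table entry for the requested term. Leaving a field out of `indexed_fields` makes its contents unsearchable, which mirrors the configuration decision described above.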