Modern corporate intranets store vast amounts of documents, procedures, instructions, and organizational knowledge. Traditional keyword-based search often fails when users search for information using terms other than those found in the documents.
Problem: an employee searches for "how to configure access to the payment system," but the document contains the phrase "payment integration configuration." Traditional search won’t find this document, even though it contains the answer to the question.
Solution: RAG (Retrieval-Augmented Generation) with a vector database enables semantic search. The system understands the meaning of the query and finds documents based on context, not just exact word matches.
In this article, we’ll show you how to integrate the Milvus vector database with Open Intranet on Drupal to create intelligent search in corporate knowledge bases.
In this article:
- What is RAG and why is it important for intranets?
- Open Intranet: starter kit for corporate intranets
- What is Milvus? Vector database for RAG
- What does the integration architecture look like? Open Intranet + Milvus RAG
- How to install Open Intranet with the Milvus RAG option? Step by step
- How does RAG search work? Examples of use
- How can Milvus RAG be useful in organizations?
- What technologies were used in the Open Intranet + Milvus vector database demo?
- Frequently asked questions (FAQ) about the Milvus vector database on the intranet
- Milvus vector database on the intranet – summary
- Do you need to implement a vector database on your intranet?
What is RAG and why is it important for intranets?
RAG (Retrieval-Augmented Generation) is a technology that combines semantic search with AI-generated responses. In the context of corporate intranets, RAG offers many benefits.
Semantic search
Instead of searching for exact keywords, the system understands the user's intent.
Example:
- User query: "how to reset the administrator password."
- Traditional search: searches for documents containing exactly those words.
- Semantic search: finds documents about "recovering access," "changing credentials," or "restoring admin privileges," even if they don't contain those exact words.
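The contrast in the example above can be sketched in a few lines of Python. This is only an illustration: the three-dimensional vectors are hand-made, hypothetical values standing in for real embeddings (which have hundreds or thousands of dimensions), and the document titles are invented for the demo.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def keyword_match(query, title):
    """Strict keyword search: every query word must appear in the title."""
    title_words = title.lower().split()
    return all(word in title_words for word in query.lower().split())

# Hypothetical toy embeddings. The dimensions loosely stand for
# [account access, payments, office logistics].
query_text = "how to reset the administrator password"
query_vec = [0.9, 0.1, 0.0]

documents = {
    "Recovering access to an admin account": [0.8, 0.2, 0.1],
    "Payment integration configuration": [0.2, 0.9, 0.0],
    "Booking a meeting room": [0.0, 0.1, 0.9],
}

# Keyword search finds nothing because no title contains all query words.
keyword_hits = [title for title in documents if keyword_match(query_text, title)]

# Semantic search ranks documents by vector similarity instead.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query_vec, documents[title]),
    reverse=True,
)

print("Keyword hits:", keyword_hits)
print("Best semantic match:", ranked[0])
```

In a real setup the vectors would come from an embedding model rather than being written by hand, but the ranking step works the same way.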
Better results for users
Analysis of client queries shows that 66% of organizations looking for intranet solutions require advanced search or AI search. This is no coincidence – in large organizations with thousands of documents, traditional search is no longer sufficient. Artificial intelligence understands the context and intent of the user, making it ideal for working with extensive knowledge bases.
Scalability
Vector databases, such as Milvus, can handle millions of documents while maintaining fast response times. This is crucial for organizations with extensive knowledge bases.
Performance
Similarity search stays fast even on large data sets. Milvus uses approximate nearest-neighbor indexing algorithms (such as HNSW and IVF) to optimize queries.
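To give a feel for why such indexes help, here is a heavily simplified sketch of the idea behind an IVF (inverted file) index: vectors are grouped around centroids, and a query only scans the groups closest to it. This is not Milvus's implementation. The 2-D vectors and the hand-picked centroids are hypothetical, and the `nprobe` name is borrowed from the Milvus search parameter purely for illustration.

```python
import math

# Toy 2-D vectors (hypothetical). Real embeddings have far more dimensions.
vectors = {
    "doc_a": (0.10, 0.20), "doc_b": (0.20, 0.10), "doc_c": (0.15, 0.25),
    "doc_d": (0.90, 0.80), "doc_e": (0.80, 0.90), "doc_f": (0.85, 0.75),
}

# Centroids fixed by hand for the demo. A real IVF index would typically
# learn them from the data with a clustering step such as k-means.
centroids = [(0.15, 0.18), (0.85, 0.82)]

def nearest_centroids(point, count):
    """Indexes of the `count` centroids closest to `point`."""
    order = sorted(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
    return order[:count]

# Build the inverted lists: each vector is filed under its nearest centroid.
inverted_lists = {i: [] for i in range(len(centroids))}
for name, vec in vectors.items():
    inverted_lists[nearest_centroids(vec, 1)[0]].append(name)

def ivf_search(query, nprobe=1, top_k=2):
    """Scan only the `nprobe` closest lists instead of every vector."""
    candidates = [
        name for i in nearest_centroids(query, nprobe) for name in inverted_lists[i]
    ]
    return sorted(candidates, key=lambda n: math.dist(query, vectors[n]))[:top_k]

result = ivf_search((0.82, 0.88))
print(result)
```

With `nprobe=1` the query compares itself against three vectors instead of six. The trade-off is that a true nearest neighbor filed under a different centroid can be missed, which is why such indexes are called approximate.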
Flexibility
Expandable with additional AI features:
- AI chatbots with access to documents or company knowledge.
- Automatic document tagging.
- Recommendations for similar content.
- Sentiment analysis in company content.
Open Intranet: starter kit for corporate intranets
Open Intranet is an open source starter kit on Drupal for building corporate intranets.
It includes ready-made intranet features such as:
- collaboration and communication,
- news and events system,
- document sharing,
- knowledge base,
- employee directory.
The system allows organizations to quickly launch a flexible internal portal without having to build everything from scratch.

Open Intranet system with a ready-made knowledge base
What is Milvus? Vector database for RAG
Milvus is an open source vector database designed specifically for storing, indexing, and searching vector representations of text (embeddings).
How does Milvus work in the context of RAG?
- Indexing: documents from the intranet are processed by an AI model (e.g., OpenAI text-embedding-3-small), which creates vectors representing the meaning of the text.
- Storage: the vectors are stored in Milvus along with metadata (title, URL, date).
- Search: when a user asks a question, the query is also converted into a vector, and Milvus finds the most similar documents based on vector distance.
- Return of results: the system returns documents sorted by semantic similarity.
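The four steps above can be sketched with a tiny in-memory store. Everything here is a stand-in: `toy_embed` is a hypothetical concept-counting function used in place of a real embedding model, `InMemoryVectorStore` is not the Milvus API, and the titles, URLs, and dates are made up for the demo.

```python
import math
from dataclasses import dataclass

# Hypothetical vocabulary mapping words to three "concept" dimensions.
CONCEPTS = {
    "access": 0, "login": 0, "password": 0, "credentials": 0,
    "payment": 1, "invoice": 1, "billing": 1,
    "vacation": 2, "leave": 2, "holiday": 2,
}

def toy_embed(text):
    """Stand-in for an embedding model: counts concept words per dimension."""
    vec = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        word = word.strip(".,?!")
        if word in CONCEPTS:
            vec[CONCEPTS[word]] += 1.0
    return vec

def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

@dataclass
class Record:
    vector: list
    title: str
    url: str
    date: str

class InMemoryVectorStore:
    def __init__(self):
        self.records = []

    def index(self, text, title, url, date):
        # Steps 1 and 2: embed the document, store the vector with metadata.
        self.records.append(Record(toy_embed(text), title, url, date))

    def search(self, query, top_k=2):
        # Step 3: embed the query and score every stored vector.
        query_vec = toy_embed(query)
        scored = [(cosine(query_vec, r.vector), r) for r in self.records]
        # Step 4: return results sorted by similarity, best first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(round(score, 2), r.title) for score, r in scored[:top_k]]

store = InMemoryVectorStore()
store.index("Changing credentials and login settings", "Account recovery", "/kb/1", "2025-01-10")
store.index("Submitting an invoice for payment", "Billing guide", "/kb/2", "2025-02-03")
store.index("Requesting vacation or unpaid leave", "Time off policy", "/kb/3", "2025-03-15")

results = store.search("I forgot my password")
print(results)
```

The query shares no words with the "Account recovery" document, yet it ranks first because both map to the same concept dimension. A real embedding model learns such relationships instead of relying on a hand-written vocabulary.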
Why Milvus vector database?
- Open Source: full control over data, no vendor lock-in.
- Scalability: supports millions of vectors with fast response times.
- Ready integration: the ai_vdb_provider_milvus module for Drupal facilitates integration.
- Standalone mode: for smaller organizations, it can be run in standalone mode on a single server.
- Ready for production use: scalable to a cluster for larger organizations.
What does the integration architecture look like? Open Intranet + Milvus RAG
The diagram below shows the complete integration architecture:

Chart created using the Mermaid tool
What are the specific components of the integration system?
Each element of the architecture plays a specific role, ensuring smooth query processing and data management across the entire RAG environment. Below, we describe how the individual components work together within Open Intranet.
DDEV Application Stack
This development environment provides a ready-made infrastructure for running an intranet with Milvus, automating most of the configuration. This allows the entire system to be run locally in a matter of minutes.
Web Container (Drupal Application)
- Drupal 11 with PHP 8.3.
- nginx-fpm as a web server.
- Ports: 80 (HTTP), 443 (HTTPS), 8025 (Mailpit).
- Integration with Milvus via the ai_vdb_provider_milvus module.
MariaDB (Database)
- Database for Drupal.
- Version: MariaDB 10.11.
- Stores all Drupal data (content, config, users).
Milvus RAG Stack
The set of services that make up the Milvus RAG Stack is responsible for storing vectors, metadata, and executing search queries. Each component of the system plays a distinct role in ensuring high performance and stability.
etcd (Storage Layer)
- Metadata storage and coordination.
- Port: 2379.
- Stores: collection schemas, indexes, configurations.
- Why etcd? It’s a distributed key-value store used by Milvus to store metadata and coordinate between components. Without etcd, Milvus cannot function.
MinIO (Storage Layer)
- Object storage for vector data.
- Ports: 9000 (S3 API), 9001 (Web Console).
- Stores: vectors, segments, binary files.
- Why MinIO? It’s an object data store compatible with the S3 API. Milvus uses it to store actual vector data and segments. MinIO allows for scaling and efficient management of large amounts of vector data.
Milvus (Core Engine)
- The main vector search engine.
- Ports: 19530 (API), 9091 (Health Check).
- Functions:
  - storage of embeddings in the form of vectors,
  - semantic similarity search,
  - indexing and query optimization,
  - RESTful API for integration with Drupal.
Attu (Management UI)
- Web interface for managing Milvus.
- Port: 8521 (exposed by DDEV).
- Features:
  - browsing collections and data,
  - performance monitoring,
  - index management,
  - visualization of search results.
What does data flow look like in an intranet integrated with the Milvus vector database?
Data flow between Drupal, the embeddings model, and the Milvus vector database involves several key steps that together create an intelligent search process. Below, we describe how it works from the moment a query is made to the presentation of results.
Semantic search
- The user asks a question in the intranet interface.
- Drupal converts the query into a vector using the embeddings model (OpenAI text-embedding-3-small).
- The query is sent to Milvus via the ai_vdb_provider_milvus module.
- Milvus searches for similar vectors in the database.
- Milvus returns results sorted by semantic similarity.
- Drupal displays the results to the user with the title, a snippet of content, and the similarity score.
Content indexing
- A new document is added to the knowledge base on the intranet.
- Drupal automatically generates an embedding using the OpenAI API.
- The embedding is saved in Milvus along with metadata (title, URL, date).
- The document is ready for semantic search.
How to install Open Intranet with the Milvus RAG option? Step by step
The installation process has been simplified as much as possible thanks to a ready-made script that automatically configures all the required components. Just follow a few commands to run a full RAG demo in your environment.
Prerequisites
Before you begin, make sure you have:
- Docker Desktop: running and active.
- DDEV: installed (brew install ddev/ddev/ddev on macOS).
- OpenAI API key: required to generate embeddings.
  - Create one at: https://platform.openai.com/api-keys.
  - The key must start with sk-proj- or sk-.
  - Cost: ~$0.01-0.10 for the entire demo.
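A format check of this kind can be sketched as follows. This is not the validation used by the demo script, which is not shown here. The `looks_like_openai_key` helper and the minimum length of 20 characters after the prefix are assumptions made for illustration, and a passing check does not mean the key is actually valid.

```python
import re

# Assumed pattern: "sk-" or "sk-proj-" prefix, then at least 20 characters
# drawn from letters, digits, underscores, and hyphens.
KEY_PATTERN = re.compile(r"^sk-(proj-)?[A-Za-z0-9_-]{20,}$")

def looks_like_openai_key(value):
    """Return True if the string has the expected prefix and a plausible length."""
    return bool(KEY_PATTERN.match(value.strip()))

# Placeholder strings, not real keys.
print(looks_like_openai_key("sk-proj-" + "x" * 32))
print(looks_like_openai_key("sk-" + "x" * 32))
print(looks_like_openai_key("pk-" + "x" * 32))
print(looks_like_openai_key("sk-"))
```

A check like this only catches copy-paste mistakes early. Whether the key works is confirmed by the first real API call.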
Open Intranet RAG demo installation process
Run the following commands:
git clone https://github.com/droptica/openintranet_rag_demo.git
cd openintranet_rag_demo
./launch_openintranet_with_rag_demo.sh
The script automatically performs the following:
- Cloning Open Intranet from Drupal.org.
- Downloading the docker-compose configuration for Milvus VDB.
- Configuring DDEV (Drupal 11, PHP 8.3).
- Starting containers (web, db, Milvus).
- Installing Composer dependencies.
- Adding the drupal/ai_vdb_provider_milvus:^1.1@beta module.
- Copying the recipe (Drupal Recipe) openintranet_milvus_rag.
- Installing Drupal with demo content.
- Applying the Milvus RAG recipe configuration.
- Prompting interactively for the OpenAI API key (with format validation).
- Saving the API key to the Key module in Drupal.
- Indexing Knowledge Base content to Milvus.
- Generating a one-time login link.
During installation, you’ll be asked to paste the OpenAI API key. The script validates the format and stores it securely.
Installation verification
After completing the installation, it’s worth making sure that all elements are working correctly and communicating with each other. A few simple commands will quickly verify that indexing and semantic search are working properly.
1. Checking the index status
cd openintranet_source_code/openintranet
ddev drush search-api:status
Expected result:
knowledge_base_content Knowledge Base Content 100% 24 24
If you see 100%, everything is working.
2. Verifying the connection to Milvus
- Open Milvus Attu UI: check the port using ddev describe (search for the Attu service port).
- Connect to: http://milvus:19530.
- Find the collection: openintranet_knowledge_base.
- Check: Entity Count > 0.
3. OpenAI API test
cd openintranet_source_code/openintranet
ddev drush php:eval '
// Single quotes keep the shell from expanding $provider and $result.
$provider = Drupal::service("ai.provider")->createInstance("openai");
$result = $provider->embeddings("test", "text-embedding-3-small", []);
echo count($result->getNormalized()) . " dimensions";
'
Expected result: 1536 dimensions

Screen with Milvus vector database running for Open Intranet
Need more technical information?
For more technical information, including detailed troubleshooting tips, see the project README on GitHub: https://github.com/droptica/openintranet_rag_demo.
How does RAG search work? Examples of use
Droptica's ready-made recipe for Drupal includes a sample RAG Search page at /search-rag-example. To test it:
- Open the page: https://your-site.ddev.site/search-rag-example.
- Enter a search query (e.g., "milvus configuration").
- Verify that each result shows:
  - title (link to the source page),
  - content snippet,
  - similarity score.
Search example
To show how RAG works in practice, the following example illustrates the difference between traditional search and results obtained using the Milvus vector database.
User query: "how to configure access to the system."
Traditional search will only find documents containing exactly those words.
RAG search will find documents about:
- permission configuration,
- access management,
- authorization system settings,
- login instructions.
Even if the documents don’t contain the exact phrase "how to configure access to the system."
How can Milvus RAG be useful in organizations?
Milvus allows organizations to use RAG in various business scenarios, from document search to content analysis. Here are some examples.
1. Document search
Finding documents based on meaning and context rather than keywords. Example: an employee searches for "emergency procedure" and the system finds documents about "business continuity plans" and "crisis scenarios."
2. Chatbots with company knowledge
Creating chatbots with access to the organization's current knowledge. The chatbot can answer employee questions using documents from the intranet as a source of knowledge.
3. Content recommendations
Suggesting similar content to users based on semantic similarity. Example: after reading a document about "data security," the system suggests documents about "GDPR" and "privacy protection."
4. Automatic tagging
Automatically assigning tags based on document content. The system analyzes the meaning of the text and assigns appropriate categories without manual intervention.
5. Sentiment analysis
Analysis of sentiment in company content. The system can identify documents that need to be updated or those that can build a positive organizational culture.
What technologies were used in the Open Intranet + Milvus vector database demo?
Below is a detailed list of the technologies used.
Drupal 11
- Version: 11.x
- PHP: 8.3
- Database: MariaDB 10.11
- Web server: nginx-fpm
Milvus
- Version: 2.5.18
- Mode: Standalone (for development)
- API: RESTful on port 19530
- Embeddings: 1536 dimensions (text-embedding-3-small)
OpenAI
- Embedding model: text-embedding-3-small
- Dimensions: 1536
- Cost: ~$0.01-0.10 for the entire demo
DDEV
- Version: v1.24.10
- Platform: Docker Desktop
- Networking: ddev_default (external network)
Frequently asked questions (FAQ) about the Milvus vector database on the intranet
Check out the most frequently asked questions and answers about integrating Milvus with your intranet.
Does RAG require a constant internet connection for the OpenAI API?
In the demo version of the project on GitHub, a connection to the OpenAI API is required. However, the solution can be configured with other embedding models depending on the organization's needs, for example local models (such as Sentence Transformers) that run without an internet connection, or other cloud embedding APIs.
What are the costs of using the OpenAI API for embeddings?
The text-embedding-3-small model costs $0.02 per 1M tokens. For a typical knowledge base of 1,000 documents (averaging 500 words each, or roughly 650,000 tokens at about 1.3 tokens per word), the one-time indexing cost is on the order of $0.01-0.02. Searching only requires generating an embedding for the query (a few words), so the per-query cost is negligible.
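As a rough back-of-the-envelope check, the estimate can be worked out from the per-token price. The ratio of 1.3 tokens per word is an assumption (a common rule of thumb for English prose), and the `embedding_cost` helper is hypothetical. Chunk overlap, retries, and repeated reindexing would push the real figure higher.

```python
PRICE_PER_MILLION_TOKENS = 0.02  # USD for text-embedding-3-small, as quoted above
TOKENS_PER_WORD = 1.3            # assumption: rough average for English prose

def embedding_cost(documents, words_per_document):
    """Estimated cost in USD of embedding the given amount of text once."""
    tokens = documents * words_per_document * TOKENS_PER_WORD
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

indexing_cost = embedding_cost(1_000, 500)  # the whole knowledge base
query_cost = embedding_cost(1, 10)          # a single ten-word query

print(f"Indexing 1,000 documents: ${indexing_cost:.4f}")
print(f"One ten-word query:       ${query_cost:.8f}")
```

Under these assumptions, indexing the whole knowledge base costs between one and two cents, and a single query costs a small fraction of a cent.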
How to scale the solution for a larger organization?
For larger organizations, you can:
- switch from standalone mode to Milvus cluster (multiple nodes),
- use larger MinIO instances for greater capacity,
- split etcd into separate nodes for better performance,
- add load balancers in front of the Milvus API.
Can other embedding models be used instead of OpenAI?
Yes, the ai_vdb_provider_milvus module is agnostic to the source of embeddings. You can use other providers (other cloud embedding APIs or local models) as long as the vectors they return match the dimension configured for the Milvus collection.
How often should the content be reindexed?
It depends on the frequency of changes in the knowledge base. For dynamic intranets with frequent updates, you can configure automatic reindexing with each content change. For more static databases, reindexing once a day or once a week is sufficient.
Does the solution work for organizations with compliance requirements (GDPR, healthcare sector)?
Yes, with one caveat. All storage and search components (Drupal, Milvus, etcd, MinIO) can run on-premises, so the indexed data stays within the organization's infrastructure. However, the OpenAI API requires sending document content to an external service to generate embeddings, so for highly sensitive data, consider local embedding models.
What are the hardware requirements for Milvus in standalone mode?
For small organizations (up to 10,000 documents), the following is sufficient:
- 4GB RAM
- 2 CPU cores
- 20GB disk
For larger organizations, the requirements increase proportionally to the number of documents and queries.
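To put the numbers in perspective, the raw size of the stored vectors can be estimated directly. The sketch below assumes 32-bit floats and one 1536-dimension vector per document. In practice, documents are often split into several chunks, and indexes, metadata, and the Milvus services themselves add overhead on top, so treat this as a lower bound. The `raw_vector_bytes` helper is hypothetical.

```python
DIMENSIONS = 1536        # text-embedding-3-small output size
BYTES_PER_FLOAT32 = 4    # assumption: vectors stored as 32-bit floats

def raw_vector_bytes(num_vectors, dimensions=DIMENSIONS):
    """Size in bytes of the vector data alone, without index or metadata."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32

size = raw_vector_bytes(10_000)
print(f"{size:,} bytes = {size / 1024**2:.1f} MiB")
```

For 10,000 vectors this comes to about 59 MiB, so under these assumptions the raw vectors account for only a small part of the 4GB RAM figure.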
Milvus vector database on the intranet – summary
The integration of Milvus RAG with Open Intranet opens up new possibilities for corporate platforms. The most important benefits include:
- Intelligent search based on meaning, not just keywords.
- Better user experience in the intranet thanks to understanding context and intent.
- Scalability for organizations with large knowledge bases.
- Flexibility in expanding with additional AI features.
The core components (Drupal, Milvus, etcd, MinIO) are open source, which means full control over stored data and no vendor lock-in. The solution is ready for production use and can be scaled according to the needs of the organization.
Do you need to implement a vector database on your intranet?
At Droptica, we design and implement AI-based solutions using LLMs, vector databases, and advanced RAG pipelines. We help you choose the right technology, integrate semantic search, create corporate chatbots, and optimize the quality of generated responses. Check out our generative AI development service and see how we can support your organization in building intelligent data-driven solutions.