Creating a Custom Gemini API with CI/CD

Introduction

In this article, we will build a CI/CD pipeline for a Flask web application that serves as an API for parsing electronic component attributes with the Gemini API. Below is a chart showing the key components of this web application.

The first thing you may notice is that the Mouser API sits in the middle as an intermediary, and there is a good reason for that.

Modern language models perform well when they have data to process, but they can produce incorrect results when there is no relevant input. My tests showed that parsing part numbers alone can lead the model to generate data that wasn’t originally there. Adding a request to Mouser improves accuracy at the cost of very little added complexity.
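
As a rough illustration of this idea, the sketch below builds the text sent to the model from a part number plus a catalog description. The lookup_description callable is a hypothetical stand-in for the Mouser request, and the part number and description are made up for the example. The concatenation mirrors the way the prompt is built in the Challenges and Solutions section.

```python
from typing import Callable, Optional


def build_prompt(mfg_part_num: str,
                 lookup_description: Callable[[str], Optional[str]]) -> str:
    """Combine the part number with a catalog description, if one is found.

    lookup_description is a hypothetical stand-in for the Mouser request.
    """
    description = lookup_description(mfg_part_num)
    if not description:
        # Without catalog data the model only sees the bare part number,
        # which is the case where invented attributes are most likely.
        return mfg_part_num
    return mfg_part_num + " " + description


# Fake lookup used only for this example (no network access, invented data).
def fake_lookup(part_number: str) -> Optional[str]:
    catalog = {"EXAMPLE-PN-001": "MLCC 0.27uF 100V X7R 0805 10%"}
    return catalog.get(part_number)
```

With a description available, the model has concrete attributes to extract instead of guessing them from the part number alone.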

This project is a great example of how to build a CI/CD pipeline. It includes two main parts: continuous integration and continuous delivery, which support an agile DevOps workflow. The pipeline makes the deployment process more transparent, automated, and reliable.

Setting Up the API

To start working with any API, the first step is to obtain access credentials. For the Gemini API, visit the Google Cloud Console. Begin by creating a new Google Cloud project, then enable the Generative Language API. Finally, generate a key by creating new credentials.

Getting a Mouser API key is much simpler. All you need to do is create a new account and enable the Search API in the profile settings.

Testing the API

To proceed, start by cloning the source code from my public GitHub repository: Gemini-Part-Parser. Once the repository is cloned, follow the instructions in the README.md file. If you're using a modern IDE like PyCharm, you can run the necessary commands directly from its built-in terminal.

Before running the application, make sure to fill in the .env file with the API credentials from the previous steps. Your environment file should look like this:

GEMINI_API_KEY=aaaa-bbbb-cccc-dddd
MOUSER_API_KEY=eeee-ffff-gggg-hhhh
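
Before starting the application, it can help to verify that both keys are actually set. The following is a minimal sketch. It assumes the values from the .env file have already been loaded into the process environment (for example by a dotenv loader or by Docker), which may differ from how the repository handles it. The missing_keys helper is a name introduced here for illustration.

```python
import os
from typing import List, Mapping

# Key names taken from the example .env file above.
REQUIRED_KEYS = ("GEMINI_API_KEY", "MOUSER_API_KEY")


def missing_keys(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required keys that are unset or blank."""
    return [key for key in REQUIRED_KEYS if not environ.get(key, "").strip()]
```

Calling missing_keys() at startup and aborting when the list is non-empty gives a clearer error than a failed API request later on.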

You can run the tests either through the command line or using the built-in testing tool in your IDE. The project includes three tests: one for each API and a third that integrates both APIs.

The tests include assertions for both valid and invalid requests. A key challenge in creating these tests was ensuring that the Gemini API correctly identifies and rejects requests unrelated to the project's scope. The prompt enforces this behavior with the following instruction:

If you have been asked any question that is not related to parsing the part, then answer using response exactly as "I can only parse part attributes."

Notice that the test for the invalid prompt passed because the model returned the exact line of text specified above.
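
A check of this kind could look roughly like the sketch below. It is not the test from the repository: stub_model stands in for the real Gemini call so the example runs offline, and the out-of-scope question is an arbitrary example.

```python
from typing import Callable

# The exact sentence the prompt instructs the model to return.
REFUSAL = "I can only parse part attributes."


def is_refusal(response_text: str) -> bool:
    """True when the model answered with the exact refusal sentence."""
    return response_text.strip() == REFUSAL


def check_invalid_prompt(model: Callable[[str], str]) -> None:
    """Assert that an out-of-scope question is rejected."""
    response = model("What is the capital of France?")
    assert is_refusal(response), f"unexpected response: {response!r}"


# Stub standing in for the real Gemini call (assumption for this example).
def stub_model(user_prompt: str) -> str:
    return "I can only parse part attributes.\n"
```

Comparing against the exact sentence, after trimming whitespace, keeps the test strict: any paraphrased answer counts as a failure.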

Another approach to testing is manually executing requests using Postman. Postman is an excellent tool for simulating POST requests and analyzing the responses, making it ideal for testing APIs interactively and ensuring they function as expected.
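
The same kind of request can also be scripted. The sketch below builds a JSON POST request with the Python standard library. The "/parse" path, the port, and the "mfg_part_num" field name are assumptions made for this example, so check the repository README for the actual route and payload.

```python
import json
import urllib.request


def build_parse_request(base_url: str, mfg_part_num: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one part number.

    The "/parse" path and the "mfg_part_num" field are hypothetical.
    """
    body = json.dumps({"mfg_part_num": mfg_part_num}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/parse",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_parse_request("http://localhost:5000", "EXAMPLE-PN-001")
# To send it against a running server:
# with urllib.request.urlopen(request, timeout=30) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```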

Setting Up CI/CD with GitHub Actions

For this project, we will set up an automated workflow that builds a Docker image, pushes it to a Docker Hub repository, and deploys a Docker container to an EC2 instance.

We begin the setup by creating an Amazon EC2 instance. Since this project doesn't require a lot of computing power, a free-tier t2.micro instance is sufficient.

The next step is to add a new self-hosted GitHub runner to our EC2 instance. This can be done by following the instructions provided by GitHub, which vary depending on the operating system of the instance.

After installing the Actions Runner, you can start it by executing ./run.sh in the terminal. To keep the runner running in the background, install it as a service with sudo ./svc.sh install and then start it with sudo ./svc.sh start. This setup keeps the runner connected to GitHub so that it can execute jobs triggered by GitHub Actions.

The final step is to add secrets to your repository so that sensitive values are not stored in the code. Once added, these secrets can be used in the pipeline. For example, the line environment: development specifies the environment for a job, and a secret value can be referenced with the following expression:

${{ secrets.DOCKERHUB_USERNAME }}

GitHub Secrets are also useful for populating .env files when deploying Docker images, making the deployment process more secure.
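
One way to populate the file is a small script run as a workflow step. The sketch below assumes the step exposes the secrets as environment variables with the same names as the .env keys. Both that mapping and the write_env_file helper are assumptions for illustration, not the repository's actual workflow.

```python
import os
from pathlib import Path
from typing import Iterable, Mapping


def write_env_file(path: Path, keys: Iterable[str],
                   environ: Mapping[str, str] = os.environ) -> None:
    """Write KEY=value lines for the given keys, failing if any is missing."""
    lines = []
    for key in keys:
        value = environ.get(key)
        if not value:
            raise KeyError(f"{key} is not set in the environment")
        lines.append(f"{key}={value}")
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    # Restrict the file to the owner, since it now holds credentials.
    path.chmod(0o600)
```

Failing fast on a missing key avoids deploying a container that starts without credentials.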

We are now ready to trigger the first automatic deployment. When changes are committed to the main branch, the Actions Runner picks up the jobs defined in the GitHub Actions workflow. You can monitor the jobs in real time in the runner's command-line output.

GitHub provides a helpful user interface to monitor jobs while they are executed and debug any issues that may arise.

If all commands execute successfully, the production container will remain running on the EC2 instance. To check, run docker ps -a, which lists all containers on the instance, including stopped ones, along with their status and details. Use docker ps without -a to list only the running containers.
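
The same check can be scripted, for instance as a post-deployment step. The sketch below assumes the Docker CLI is installed on the instance. The container names in the usage note are invented for the example.

```python
import subprocess
from typing import Dict

# Tab-separated name and status, one container per line.
PS_FORMAT = "{{.Names}}\t{{.Status}}"


def parse_ps_output(output: str) -> Dict[str, str]:
    """Map container name to status from `docker ps --format` output."""
    containers = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, status = line.partition("\t")
        containers[name] = status
    return containers


def running_containers() -> Dict[str, str]:
    """Ask the local Docker CLI for running containers only (no -a flag)."""
    result = subprocess.run(
        ["docker", "ps", "--format", PS_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return parse_ps_output(result.stdout)
```

For example, parse_ps_output("api\tUp 2 hours\n") yields a dictionary with the single entry "api" mapped to "Up 2 hours".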

Challenges and Solutions

One problem I ran into was that the Gemini API response format didn't match the expected JSON output, even though that format was specified in the prompt. Here is the sample output that we will get if we execute this code:

response = gemini_model(user_prompt=prompt)

```json
{
  "Capacitance": "0.27",
  "Capacitance Units": "uF",
  "Voltage": "100",
  "Voltage Units": "V",
  "Dielectric Type": "X7R",
  "Case Code": "0805",
  "Tolerance": "10%",
  "Mount Type": "SMD/SMT"
}
```

The response is wrapped in a Markdown code fence, so the IDE's built-in JSON validator reports an error, and parsing the text directly in Python fails as well. To fix this, the tests and production routes include a line of code that strips the fence characters from the beginning and end of the response text.

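import json

# Inside the route handler: build the prompt and query the model
prompt = mfg_part_num + ' ' + description
response = gemini_model(user_prompt=prompt)

# Strip the Markdown fence: the leading "```json" plus newline (8 characters)
# and the trailing newline plus "```" (4 characters)
cleaned_response = response[8:-4]

# Validate the response (JSON format)
try:
    parsed_response = json.loads(cleaned_response)
except json.JSONDecodeError:
    return "The response is not valid JSON."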
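
The fixed slice works as long as the response starts and ends exactly as in the sample above. If the model adds a trailing newline or omits the fence, the slice cuts into the JSON itself. A more tolerant variant, offered as a sketch and not as the code used in the repository, removes the fence only when it is present. The helper names are introduced here for illustration.

```python
import json
import re

FENCE = "`" * 3  # the three backticks of a Markdown code fence

# Optional language tag after the opening fence, then the payload,
# then the closing fence and any trailing whitespace.
_FENCED = re.compile(
    r"^\s*" + FENCE + r"[A-Za-z]*[ \t]*\n(.*?)\n?[ \t]*" + FENCE + r"\s*$",
    re.DOTALL,
)


def strip_code_fence(text: str) -> str:
    """Return the text inside a Markdown fence, or the trimmed text itself."""
    match = _FENCED.match(text)
    return match.group(1) if match else text.strip()


def parse_model_json(text: str) -> dict:
    """Parse the model response as JSON; raises json.JSONDecodeError if invalid."""
    return json.loads(strip_code_fence(text))
```

The route can still catch json.JSONDecodeError exactly as before. Only the cleaning step changes.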

Conclusion

This application is a valuable tool for developers looking to parse part numbers for electronic components. Moreover, anyone can use this project to gain hands-on experience with CI/CD pipelines, GitHub Actions, Docker Hub, and AI models. In the future, I plan to make the application more flexible by allowing the language model to decide which parameters to return based on the part category.