Google
 

Saturday, September 7, 2024

Unlocking the Power of LLMs Locally with Docker Compose

Although the expectations that came with the advent of Large Language Models (LLMs) could be largely exaggerated, they still have proven to be useful in many scenarios and for a very wide audience. Probably, this is the closest non-technical users have come to interact with AI since its inception. ChatGPT has set the record for the fastest-growing user base.

Being an online tool, it comes with some concerns over privacy, customizability, and the constant need for an internet connection.
However ChatGPT is not the only player in this game. Many models are available online and for offline download. The latter is the focus of this post.

Recently, Meta release its latest model Llama 3.1. And I wanted to give it a try on my laptop, after positive feedback about it.

So, the objective I had was:

  • Find and easy way to setup the tooling required to download LLMs and get responses to my prompts locally.
  • Have a nice user interface that I can use to give prompts, save prompt history, and customize my environment.

 Two tools play very well together:

Ollama: https://ollama.com/
Think of Ollama as the npm or pip for language models. It enables you to download models, execute prompts from the CLI, list models and so on. It also provides APIs that can be called from other applications. These APIs are compatible with OpenAI Chat Completions API.

Open WebUI: https://openwebui.com/
Self-hosted web interface, very similar to what you get from OpenAI's ChatGPT web interface. Open WebUI can interface with Ollama. Think about it as a front end for the backend provided by Ollama.

 

Since I prefer to use Docker whenever possible to experiment with new tools, I opted to use it instead of installing any tools locally. Especially that I'm almost a complete beginner to this space.

Open WebUI provides docker images that include both Open WebUI and Ollama in the same image! Which makes setting up the whole stack locally super easy.

The documentation provides this example command to run the container:

docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
This command does the following:
  1. Starts the container, from the image ghcr.io/open-webui/open-webui:ollama. The --restart always parameter restarts the container automatically if it fails or was stopped.
  2. Maps the local port 3000 to Open WebUI port 8080.
  3. Maps docker volumes to paths within the container. This persists data even if the container is deleted. This is important to persist history and downloaded models, which are big.

I prefer to use docker compose, additionally I wanted to have the easy visibility on the data created by Open WebUI and the models downloaded by Ollama, so I chose to bind folders on my local machine instead of using volumes. Here is how the docker-compose.yml looks like:

services:
  OpenWebUI:
    image: ghcr.io/open-webui/open-webui:ollama
    container_name: open-webui
    environment:
      - WEBUI_AUTH=False
    volumes:
      - C:\open-webui:/app/backend/data
      - C:\ollama:/root/.ollama
    ports:
      - 3000:8080
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

Starting the stack is easy:

docker compose up -d

It may take a short while before it's ready, If you check the container logs (I use Docker Desktop on windows) you should see something similar to:

 

Then you can open your browser on http://localhost:3000/ and start playing.

Downloading models

Remember that this docker image does not include any models yet so probably the first step is to click the plus icon beside the "Select a model" label and write a model name. You'll find a list of model names in https://ollama.com/library. Click a model name, and choose the model size and the model name will be shown. So to download the 8b (8 billion parameters version) of llama3.1, write llama3.1:8b in the Open WebUI interface.

As shown in the screenshot below, I downloaded llama3.1:8b, gemma2:2b (Google's lightweight model). Note that the larger the number of parameters, the higher the specs your computer needs to have.

Testing with some prompts

After downloading models, you can try some prompts:


Note that you can interact with ollama CLI directly. Either use docker compose exec OpenWebUI bash or use docker desktop:


Now if you want to stop the container, run docker compose down

A note on GPUs

Ollama can run models using GPU, or CPU only. As you see in the docker compose file, I'm specifying that the container can use all the available GPUs on my machine. If you're using Docker desktop with WSL support, ensure that you have the latest WSL and Nvidia drivers installed. You can use this command to test that docker GPU access is working fine:

docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark


Closing notes:

It's very exciting to have an LLM running locally. It opens a lot of customization possibilities and keeps your data private.
The machine I tried this experiment on is relatively old, so gemma2:2b was much faster than llama3.1:7b and still performed very well.
Looking forward to experiment more!

Note: The title of this post was recommended by gemma2 :)


Saturday, June 8, 2024

The simplest data pipeline ever

As software engineers what we care most about is getting stuff done. The simplest approach is probably the best, as long as it doesn't cause long term issues.

I was working on a project where the creation and initialization of a few PostgreSQL databases required data migration and transformation from another set of databases. The first step in this process involved loading a few hundreds of millions of records from the source databases into the destination databases.
I wanted to implement this in a way that is fully automated, and at the same time, I wanted to simplify this process as much as possible. So, no fancy tools. No cloud infrastructure. No nothing. Is that even possible?

The approach we followed was simply using Linux pipes! Linux pipes can transfer data from one process to another. So how could this help with this data import process ?

Postgres provides useful command line utilities that can be used to export and import data. For example this command exports data from a table called mydata to the standard output:

psql -h source -U postgres -d test -c "\copy mydata TO STDOUT"

On the other hand you can import data from standard input to a database table using this command:

psql -h destination -U postgres -d test -c "\copy mydata FROM STDIN"

With the power of Linux pipelines it's possible to stitch these commands together so that the output of the first command feeds into the input of the next and data flows from one database to another.

psql -h source -U postgres -d test -c "\copy mydata TO STDOUT" | psql -h destination -U postgres -d test -c "\copy mydata FROM STDIN"

Pretty simple. Isn't it? However since this operation may take a few hours, wouldn't it be nice to have some sort of progress indication ?
This is where another handy Linux utility comes to play: The pv utility. From the man page:

pv - monitor the progress of data through a pipe

By default this tool shows the number of bytes flowing through the pipe, however in the process of data import, the number of imported records is a better representative of the progress of the operation. The good thing is that pv has switches that enable counting the number of lines instead of bytes. So the final solution would look like:

psql -h source -U postgres -d test -c "\copy mydata TO STDOUT" | \ 
pv --line-mode --size 100000000 | \ 
psql -h destination -U postgres -d test -c "\copy mydata FROM STDIN"

Note that the --size parameter assumes knowledge of the total number of records, which can be retrieved using a simple select count(*), or just omitted.

When I run this in terminal, the progress looks like:


Transferring 100m records from one database to another both running as containers on my local machine took about 8 minutes. In case of transferring data over the network, it's expected to be slower.


Needless to say that in the real life implementation other steps were required like retrieving the credentials for the source and destination databases and hooking the scripts into a CI pipeline.

Surely, this isn't the most efficient way to transfer a lot of data. Note that the data is transferred as text which is far less efficient than the binary transfer that proper data import tools would use. As with any decision we make as software engineers it's all about tradeoffs. My priorities were clear: We need the simplest possible repeatable solution.

If you're interested in trying this on your machine, this is how I prepared the above screenshots:
Let's start with a docker compose file which instantiates 3 containers:

  1. A source database
  2. A destination database
  3. And a client where there data import / export process is executed.
version: "3.8"

networks:
  db-network:
    driver: bridge

services:
  source:
    image: postgres:16.1
    environment:
      - POSTGRES_PASSWORD=MYPASS123
    volumes:
      - type: volume
        source: source-data
        target: /var/lib/postgresql/data
    ports:
      - 9432:5432


  destination:
    image: postgres:16.1
    environment:
      - POSTGRES_PASSWORD=MYPASS123
    volumes:
      - type: volume
        source: destination-data
        target: /var/lib/postgresql/data
    ports:
      - 9433:5432

  client:
    container_name: postgres_client
    build: .
    entrypoint: [ "sleep", "infinity" ]


volumes:
  source-data:
    external: true
    name: source-data

  destination-data:
    external: true
    name: destination-data

 

Note that for the client container a Dockerfile is used and that is to ensure that the required utilities for this process -in particular pv- are installed. Additionally to copy the .pgpass file which contains the database passwords. 

FROM postgres:16.1

RUN apt-get update

RUN apt-get install pv

COPY pgpass /root/.pgpass

RUN chmod 0600 /root/.pgpass

source:5432:test:postgres:MYPASS123
destination:5432:test:postgres:MYPASS123

 

Then start this docker compose stack using:

docker compose up -d --build

Connect to the databases using your favorite tool (mine is Azure data studio) and create the test table by executing this query:

CREATE TABLE public.mydata (
	id int NOT NULL,
    firstname varchar NULL,
	lastname varchar NULL,
	email varchar NULL,
	CONSTRAINT mydata_pk PRIMARY KEY (id)
);

The next step would be to populate some test data into the source database:

INSERT INTO public.mydata
(firstname, lastname, email, id)
select concat('firstname', counter), concat('lastname', counter), concat('firstname', counter, '.', 'lastname', counter, '@email.com'), counter
	from pg_catalog.generate_series(1, 100000000) as counter

Then connect to the client container using:

docker compose exec client bash

And execute the script to start the data migration:

psql -h source -U postgres -d test -c "\copy mydata TO STDOUT" | \ 
pv --line-mode --size 100000000 | \ 
psql -h destination -U postgres -d test -c "\copy mydata FROM STDIN"

 

I hope this helps.

Friday, February 16, 2024

Changing log level for .net apps on the fly

Logging is very important to understand the behavior of an application. Logs can be used to analyze application behavior over an extended time period to understand trends or anomalies, but they're also critical to diagnose issues in production environments when the application is not behaving as expected.

How much logs an application should emit is a matter of tradeoffs. Writing too much logs may negatively impact application performance and increase data transfer and storage costs without adding value. Too few logs makes it very difficult to troubleshoot issues. This is why most logging frameworks allow configuring log levels so that the application developers can add as much logging as needed, but only logs with a specific level or below will actually be written to the destination.

The challenge is that you don't need all the logs all the time. You certainly can redeploy or reconfigure the application and restart it to change the log level, but this would be a bit disruptive. The good thig is that .net configuration system allows updating configuration values on the fly. Consider this simple web API:


var builder = WebApplication.CreateBuilder(args);

builder.Logging.AddConsole();

var app = builder.Build();

app.MapGet("/numbers", () =>
{
    app.Logger.LogDebug("Debug");
    app.Logger.LogInformation("Info");
    app.Logger.LogWarning("Warning");
    app.Logger.LogError("Error");

    return Enumerable.Range(0, 10);
});

app.Run();

With logging configuration file:

{
  "Logging": {
    "LogLevel": {
      "Default": "Error",
      "Microsoft.AspNetCore": "Warning"
    }
  }
}
When the /numbers endpoint is called, these logs are written to the console:
fail: ConfigReload[0]
      Error

This is clearly because the configured default log level is "Error". You can add a simple endpoint that changes the log level on the fly, like this:


app.MapGet("/config", (string level) => 
{
    if (app.Services.GetRequiredService<IConfiguration>() is not IConfigurationRoot configRoot)
        return;

    configRoot["Logging:LogLevel:Default"] = level;
    configRoot.Reload();
});

When you issue the GET request /config?level=Information Then invoke the /numbers endpoint again, the log output will look like:

info: ConfigReload[0]
      Info
warn: ConfigReload[0]
      Warning
fail: ConfigReload[0]
      Error

Similarly, to configure the log level to Debug, invoke /config?level=Debug. Very simple.

There are a few gotchas to consider:

  1. This the /config endpoint should be secured, only a privileged user should be able to invoke it as it changes the application behavior. I've intentionally ignored this in my example for simplicity.
  2. In case there are many instances serving the same API the /config invocation will be directed by the load balancer to only one instance of your application which most probably won't be sufficient. In this case you will need another approach to communicate with your application that the log level should be modified. One approach could be a pub-sub system that allows multiple consumers. This may be a subject of another blog post.

Another common approach for reconfiguring.net applications on the fly is by using a configuration source that refreshes automatically every specific time interval or based on config file change detection.
However the time based approach means that you have to wait until a certain time elapses for the application to reconfigure itself which may not be desirable as you want to change the log level as quickly as possible. A file change detection approach is not great for immutable deployments like container based applications or serverless functions.

Logging and monitoring are quality attributes that should be taken into consideration during the application design. In case you're not using a more advanced observability tooling that allow profiling for example then the technique proposed in this blog post may be of help.

Friday, January 12, 2024

Assertions of Equality and Equivalence

I remember that I encountered an interesting bug that was not detected by unit tests because the behaviour of the test framework did not match my expectations.
The test was supposed to verify that the contents of an array (or a list) returned by the code under test match an expected array of elements in the specific order of that expected array. The unit test was passing, however, later the team discovered a bug, and the root cause was that the array was not in the correct order! This is exactly why we write automated tests, but the test failed us.

The test, which uses FluentAssertions library basically looked like:

[Test]
public void FluentAssertions_Unordered_Pass()
{
	var actual = new List<int>  {1, 2, 3}; // SUT invocation here
	var expected = new [] {3, 2, 1};

	actual.Should().BeEquivalentTo(expected);
}
Although the order of the elements of the actual array don't match the expected, the test passes. This is not a bug in FluentAssertions. It's by design, and the solution is simple:
actual.Should().BeEquivalentTo(expected, config => config.WithStrictOrdering());

 

The config parameter enforces a specific order of the collection. It's also possible to configure this globally, when initializing the test assembly for example:

AssertionOptions.AssertEquivalencyUsing(config => config.WithStrictOrdering());

 

The default behavior of this method annoyed me. In my opinion, the test method should be strict by default. That is, it should assume that the collection should be sorted, and can be made more lenient by overriding this behavior. Not the opposite.

Probably I got into the habit of using BeEquivalentTo(), while an Equal() assertion exists, which "Expects the current collection to contain all the same elements in the same order" as it's default behavior. There are other differences between BeEquivalentTo() and Equal() that don't matter in this context. 

Similar behavior applies to Nunit assertions, although there is no way to override the equivalence behavior:

[Test]
public void NUnit_Unordered_Pass()
{
	var actual = new [] {1, 2, 3};
	var expected = List<int>  {3, 2, 1};

	Assert.That(actual, Is.EquivalentTo(expected)); // pass
	CollectionAssert.AreEquivalent(expected, actual); // pass
}
[Test]
public void NUnit_Unordered_Fail()
{
	var actual = new [] {1, 2, 3};
	var expected = new List<int> {3, 2, 1};

	Assert.That(actual, Is.EqualTo(expected)); // fail
	CollectionAssert.AreEqual(expected, actual); // fail
}

 

It's important to understand the behavior of the testing library to avoid similar mistakes. We rely on tests as our safetly net, and they better be reliable!

Friday, September 22, 2023

Handling special content with Handlebars.net Helpers

Generating formatted reports based on application data is a very common need. For example, you may want to create an HTML page with content from a receipt. This content may be sent in an HTML formatted email or converted to PDF or any other use case. To achieve this, a flexible and capable templating engine is needed to transform the application data to a human readable format.
.net has a very powerful templating engine that's used in its asp.net web framework which is Razor templates. But what if you want to use a templating engine that is simpler, and doesn't require a web stack as in the case of building background jobs, desktop or mobile applications?

 


Handlebars.net is a .net implementation of the famous HandlebarsJS templating framework. From Handlebars.net Github repository:

"Handlebars.Net doesn't use a scripting engine to run a Javascript library - it compiles Handlebars templates directly to IL bytecode. It also mimics the JS library's API as closely as possible."
For example: consider this collection of data that should be rendered as an HTML table:

var employees = new [] 
{
    new Employee
    {
        BirthDate= DateTime.Now.AddYears(-20),
        Name = "John Smith",
        Photo = new Uri("https://upload.wikimedia.org/wikipedia/commons/thumb/2/29/Houghton_STC_22790_-_Generall_Historie_of_Virginia%2C_New_England%2C_and_the_Summer_Isles%2C_John_Smith.jpg/800px-Houghton_STC_22790_-_Generall_Historie_of_Virginia%2C_New_England%2C_and_the_Summer_Isles%2C_John_Smith.jpg")
    },
    new Employee
    {
        BirthDate= DateTime.Now.AddYears(-25),
        Name = "Jack",
        Photo = new Uri("https://upload.wikimedia.org/wikipedia/commons/e/ec/Jack_Nicholson_2001.jpg")
    },
    new Employee
    {
        BirthDate= DateTime.Now.AddYears(-40),
        Name = "Iron Man",
        Photo = new Uri("https://upload.wikimedia.org/wikipedia/en/4/47/Iron_Man_%28circa_2018%29.png")
    },
};

A Handlebars template may look like:

<html>
<body>
	<table border="1">
		<thead>
			<tr>
				<th>Name</th>
				<th>Age</th>
				<th>Photo</th>
			</tr>
		</thead>
		<tbody>
			{{#each this}}
			<tr>
				<td>{{Name}}</td>
				<td>{{BirthDate}}</td>
			</tr>
			{{/each}}
		</tbody>
	</table>

</body>
</html>

The template is fairly simple. Explaining the syntax of Handlebars templates is beyond the scope of this article. Check Handlebarjs Language Guide for information regarding its syntax.

Passing the data to the Hanledbar.net and render the template is easy:

var template = File.ReadAllText("List.handlebars");
var compiledTemplate = Handlebars.Compile(template);
var output = compiledTemplate(employees);

Console.WriteLine(output);

Line 1 reads the List.handlebars template which is stored in the same application folder, alternatively the template can be stored as an embedded resource or retrieved from a database or even created on the fly.
Line 2 compiles the template, generating a function that can be invoked later. 

Note: For good performance, the compiled template should be generated once and used multiple times during the lifetime of the application.

Line 3 invokes the function passing the employees collection and receives the rendered output in a string variable.

This is the generated HTML:

<html>
<body>
	<table border="1">
		<thead>
			<tr>
				<th>Name</th>
				<th>Age</th>
				<th>Photo</th>
			</tr>
		</thead>
		<tbody>
			<tr>
				<td>John Smith</td>
				<td>2003-09-09T22:08:23.3541971+10:00</td>
				<td><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/29/Houghton_STC_22790_-_Generall_Historie_of_Virginia%2C_New_England%2C_and_the_Summer_Isles%2C_John_Smith.jpg/800px-Houghton_STC_22790_-_Generall_Historie_of_Virginia%2C_New_England%2C_and_the_Summer_Isles%2C_John_Smith.jpg" width="200px" height="200px" /></td>
			</tr>
			<tr>
				<td>Jack</td>
				<td>1998-09-09T22:08:23.3839317+10:00</td>
				<td><img src="https://upload.wikimedia.org/wikipedia/commons/e/ec/Jack_Nicholson_2001.jpg" width="200px" height="200px" /></td>
			</tr>
			<tr>
				<td>Iron Man</td>
				<td>1983-09-09T22:08:23.3839479+10:00</td>
				<td><img src="https://upload.wikimedia.org/wikipedia/en/4/47/Iron_Man_%28circa_2018%29.png" width="200px" height="200px" /></td>
			</tr>
		</tbody>
	</table>

</body>
</html>

And this is how the output is rendered by a browser:


Putting aside lack of styling which has nothing to do with Handlebars, the output seems good but suffers for two issues:

  1. The format of the Age property is not great.
  2. The image tags rendered by the template reference the full URL of the images. Every time the generated HTML is consumed and rendered, it will have to fetch the images from their sources, which may be inconvenient. Additionally, the generated template is not self-contained, and other services that consume the generated HTML (like an HTML to PDF conversion service) will have to download the images.

Although the Handlebars has a powerful templating language, it's impossible to cover all needs that may arise, this is why Handlebars.net provides the ability to define custom helpers.
 

Custom Helpers: 

Helpers provide an extensibility mechanism to customize the rendered output. Once created and registered with Handlebars.net, they can be invoked from templates as if they were part of Handlebar's templating language.
Let's use helpers to solve the date format issue:
Handlebars.RegisterHelper("formatDate", (output, context, arguments)
                => { output.Write(((DateTime)arguments[0]).ToString(arguments[1].ToString())); });

This one-line registers a formatDate helper that takes the first argument and formats it using the second argument. To call this helper in the template:

<td>{{formatDate BirthDate "dd/MM/yyyy"}}</td>

The rendered output is much better now:


Embedding images in the HTML output

To solve the second issue mentioned above, we can write a custom helper to embed image content using the data URI scheme.
This is a basic implementation of this "embeddedImage" helper:

Handlebars.RegisterHelper("embeddedImage", (output, context, arguments) =>
{
    var url = arguments[0] as Uri;
    using var httpClient = new HttpClient();

    // add user-agent header required by Wikipedia. You should safely ommit the following line for other sources
    httpClient.DefaultRequestHeaders.UserAgent.Add(new ProductInfoHeaderValue("example.com-bot", "1.0"));

    var content = httpClient.GetByteArrayAsync(url).Result;
    var encodedContent = Convert.ToBase64String(content);
    output.Write("data:image/png;base64," + encodedContent);
});

The code uses an HttpClient to download the image as a byte array, then encode it using base64 encoding, then writes the output as a data URI using the standards format. And the usage is very simple:

<img width="200px" height="200px" src="{{embeddedImage Photo}}"  />

And the HTML output looks like: (trimmed for brevity)

<img width="200px" height="200px" src=".....

 

Conclusion

One of the most important design principals is the Open-Closed Principal: software entities should be open for extension but closed for modification. Handlebars and Handlebars.net apply this principal by allowing users to extend the functionality of the library without having to modify its source code, which is a good design.
With a plethora of free and commercial libraries available for developers, the level of extensibility should be one of the evaluation criteria used during the selection process.
And you, what other templating libraries have you used in .net applications? How extensible are these libraries?

Friday, June 30, 2023

Mind games of measurements and estimates: Hidden meanings behind numbers and units


I'm a fan of science and nature documentaries. A few years ago, National Geographic Abu Dhabi was my favorite channel. It primarily featured original NatGeo content, which was dubbed in Arabic.
The content variety and interesting topics from construction, to wild life, air crash investigations and even UFO; provided me with a stream of knowledge and enjoyment. But in some times, also confusion!

One source of confusion was the highly accurate numbers used to describe things that normally could not be measured to that level of accuracy!
In one instance, a wild animal was described to have a weight reaching something like 952 kilograms. Not 900, not 1000 or even 950, but exactly 952.
In another instance, a man was describing a flying object, and he mentioned that the altitude of that object was 91 meters. That man must have laser distance meters in his eyes!

When I thought about this, I figured out that probably while translating these episodes, units of measurements were converted from pounds to kilograms, from feet and yards to meters, and from miles to kilometers, and so on. This is because the metric system is used in the Arab world and is more understandable by the audience.
Converting the above numbers back to the original units made them sound more logical. The wild animal weighed approximately 2200 pounds, and the man was describing an object flying about 100 yards or 300 feet high. That made much more sense.

But why did these round figure numbers seem more logical and more acceptable when talking about things that cannot be accurately measured? After all, 2200 pound are equal to 952 kilograms, and 100 yards are 91.44 meters. Right?

Apparently, the way we perceive numbers in casual conversations implicitly associates an accuracy level.
This Wikipedia note gives an example of this:
"Sometimes, the extra zeros are used for indicating the accuracy of a measurement. For example, "15.00 m" may indicate that the measurement error is less than one centimetre (0.01 m), while "15 m" may mean that the length is roughly fifteen metres and that the error may exceed 10 centimetres."

Similarly, smaller units can be used to give a deceiving indication of accuracy. A few years ago, I was working with a colleague on a high level estimates of a software project. We used weeks as our unit of estimate because -as expected- we knew very little about the project and we expressed this in terms of coarse-grained estimates.
From experience, we knew that this level of accuracy won't be welcome by who requested the estimates, and they may want to get more accurate ones. I laughingly told my colleague: "If they want the estimates in hours, they can multiply these numbers by 40!". I feel I was mean saying that. Of course the point was the accuracy, not the unit conversion.

One nice thing about using Fibonacci numbers in relative estimates, is that they detach the numeric estimates from any perceived accuracy. When the estimate is 13 story points, it's totally clear that the only reason why it's 13, - not 12 or 14  for example- is not because we believe it to be accurately 13. It's just because we don't have the other numbers on the estimation cards. It's simply a best guess.

Beware of the effects of units and numbers you use. They may communicate more than what you originally intended.

Wednesday, May 10, 2023

Setting exit code of a .net worker application

When building a .net worker application with a hosted service based on the BackgroundService class, it's some times it's required to set the application exit code based on the outcomes of the execution of the hosted service.

One trivial way to do this is to to set the Environment.ExitCode property from the hosted service:


public class Worker : BackgroundService
{
    public Worker()
    {

    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        try
        {
            throw new Exception("Something bad happened");
        }
        catch
        {
            Environment.ExitCode = 1;
        }
    }
}

This works, however consider these unit tests:


[Test]
public async Task Test1()
{
    Worker sut = new Worker();
    await sut.StartAsync(new CancellationToken());

    Assert.That(Environment.ExitCode, Is.EqualTo(1));
}

[Test]
public void Test2()
{
    // another test
    Assert.That(Environment.ExitCode, Is.EqualTo(0));
}

Test1 passes, however Test2 fails as Environment.ExitCode is a static variable. You can reset back to zero it after the test, but this is error-prone. So what is the alternative?

One simple solution is to use a status code-holding class as a singleton and inject it into the background service:


public interface IStatusHolder
{
    public int Status { get; set; }
}

public class StatusHolder : IStatusHolder
{
    public int Status { get; set; }
}

public class Worker : BackgroundService
{
    private readonly IStatusHolder _statusHolder;

    public Worker(IStatusHolder statusHolder)
    {
        _statusHolder = statusHolder;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        try
        {
            throw new Exception("Something bad happened");
        }
        catch
        {
            _statusHolder.Status = 1;
        }
    }
}

As simple Program.cs would look like:


using EnvironmentExit;

IHost host = Host.CreateDefaultBuilder(args)
    .ConfigureServices(services =>
    {
        services.AddHostedService<Worker>();
        services.AddSingleton<IStatusHolder, StatusHolder>();
    })
    .Build();

host.Start();

var statusHolder = host.Services.GetRequiredService<IStatusHolder>();
Environment.ExitCode = statusHolder.Status;

Note that line number 8 registers IStatusHolder as a singleton, which is important to maintain its state.

Now all tests pass. Additionally, when the application runs, the exit code is 1.