Friday, January 27, 2023

PowerShell core compatibility: A lesson learned the hard way

PowerShell Core is my preferred scripting language. I've been excited about it since its early days. Here's a tweet from back in 2016 when PowerShell Core was still in beta:

 

I've used PowerShell to automate build steps, deployments, and other tasks on both dev environments and CI/CD pipelines. It's great to write a script on my Windows machine, test it using PowerShell Core, and run it in my Linux-based Docker build environments with 100% compatibility. Or so I thought until I learned otherwise!

A few years ago, I was automating a process which required creating a folder if it didn't exist. Out of laziness, this is how I implemented this functionality: 

mkdir $folder -f

When the folder exists and the -f (short for -Force) flag is passed, the command returns the existing directory object without errors. I know this is not the cleanest way (more on this later), but it worked on my Windows machine, so it should also have worked in the Linux Docker container. Except that it didn't. When the script ran, it resulted in this error:

/bin/mkdir: invalid option -- 'f'
Try '/bin/mkdir --help' for more information.

Why did the behavior differ? It turns out that mkdir means different things depending on whether you're running PowerShell on Windows or Linux. This can be observed using the Get-Command cmdlet:

# Windows:
Get-Command mkdir

The output is:

CommandType     Name                                               Version
-----------     ----                                               -------
Function        mkdir

Under Windows, mkdir is a function, and the definition of this function can be obtained using:

(Get-Command mkdir).Definition

And the output is:

<#
.FORWARDHELPTARGETNAME New-Item
.FORWARDHELPCATEGORY Cmdlet
#>

[CmdletBinding(DefaultParameterSetName='pathSet',
    SupportsShouldProcess=$true,
    SupportsTransactions=$true,
    ConfirmImpact='Medium')]
    [OutputType([System.IO.DirectoryInfo])]
param(
    [Parameter(ParameterSetName='nameSet', Position=0, ValueFromPipelineByPropertyName=$true)]
    [Parameter(ParameterSetName='pathSet', Mandatory=$true, Position=0, ValueFromPipelineByPropertyName=$true)]
    [System.String[]]
    ${Path},

    [Parameter(ParameterSetName='nameSet', Mandatory=$true, ValueFromPipelineByPropertyName=$true)]
    [AllowNull()]
    [AllowEmptyString()]
    [System.String]
    ${Name},

    [Parameter(ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true)]
    [System.Object]
    ${Value},

    [Switch]
    ${Force},

    [Parameter(ValueFromPipelineByPropertyName=$true)]
    [System.Management.Automation.PSCredential]
    ${Credential}
)

begin {
    $wrappedCmd = $ExecutionContext.InvokeCommand.GetCommand('New-Item', [System.Management.Automation.CommandTypes]::Cmdlet)
    $scriptCmd = {& $wrappedCmd -Type Directory @PSBoundParameters }

    $steppablePipeline = $scriptCmd.GetSteppablePipeline()
    $steppablePipeline.Begin($PSCmdlet)
}

process {
    $steppablePipeline.Process($_)
}

end {
    $steppablePipeline.End()
}

As you can see, it wraps the New-Item cmdlet. Under Linux, however, it's a different story:

# Linux:
Get-Command mkdir

Output:

CommandType     Name                                               Version
-----------     ----                                               -------
Application     mkdir                                              0.0.0.0

It's an application, and the path of this application can be retrieved with:

(Get-Command mkdir).Source
/bin/mkdir

Now that I know the problem, the solution is easy:

New-Item -ItemType Directory $folder -Force

It's generally recommended to use cmdlets instead of aliases or any other kind of shortcut, to improve readability and portability. Unfortunately, PSScriptAnalyzer (which integrates well with VS Code) highlights this issue only for aliases (like ls), through its AvoidUsingCmdletAliases rule, and not for functions like mkdir.

I learned my lesson. However, I did it the hard way.

Sunday, June 5, 2022

Reading a file from a Docker container in .net core

Sometimes you need to read files from a Docker container using .NET code.
The Docker.DotNet library is very useful for interacting with Docker from .NET, and it provides a method (GetArchiveFromContainerAsync) to read files from a container.
When I tried to use this method to read a small csv/text file, the file content looked a bit weird. It seemed like there was an encoding issue!

When I checked the code on GitHub, I found that the returned data is a tarball stream. This makes sense, as the Docker documentation mentions that the returned stream is a tar stream.

To read the tar stream, I tried to use the TarInputStream class from the SharpZipLib library. However, that didn't work: apparently the library requires a seekable stream, and the stream in the GetArchiveFromContainerResponse returned by the method is not seekable.
The workaround (which works well for relatively small files) is to copy the stream to a memory stream and use that instead.

Here is a sample:

using System.Text;
using Docker.DotNet;
using Docker.DotNet.Models;
using ICSharpCode.SharpZipLib.Tar;

DockerClientConfiguration config = new();
using var client = config.CreateClient();

GetArchiveFromContainerParameters parameters = new()
{
	Path = "/root/eula.1028.txt"
};
var file = await client.Containers.GetArchiveFromContainerAsync("example", parameters, false);

// The response stream is not seekable, so buffer it in memory first.
using var memoryStream = new MemoryStream();
await file.Stream.CopyToAsync(memoryStream);
file.Stream.Close();

memoryStream.Seek(0, SeekOrigin.Begin);

// The archive holds a single entry here; position the reader on it.
using var tarInput = new TarInputStream(memoryStream, Encoding.ASCII);
tarInput.GetNextEntry();

using var reader = new StreamReader(tarInput);

var content = reader.ReadToEnd();

Console.WriteLine(content);
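To see why the raw bytes looked like an encoding problem, here is a small shell sketch that does not need Docker at all. It builds a single-file tar archive locally (sample.csv and its content are made-up stand-ins for the file in the container), then reads the file back through a pipe. The archive wraps the file in header and padding blocks, which is what showed up as "weird" content.

```shell
#!/bin/sh
# Build a tiny single-file archive, standing in for what the Docker API returns.
workdir=$(mktemp -d)
printf 'id,name\n1,example\n' > "$workdir/sample.csv"
tar -cf "$workdir/sample.tar" -C "$workdir" sample.csv

# The raw archive is much larger than the file: a header block, the data
# padded to 512-byte blocks, and an end-of-archive marker.
raw_size=$(wc -c < "$workdir/sample.tar" | tr -d ' ')

# Extract the entry to stdout (-O), reading the archive from a pipe (-f -).
content=$(cat "$workdir/sample.tar" | tar -xOf -)

printf 'raw archive: %s bytes\n' "$raw_size"
printf '%s\n' "$content"
rm -rf "$workdir"
```

With a real container, the same extraction step should work on the output of docker cp when "-" is given as the destination, since that also streams a tar archive to stdout. I have not tested that here.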

I hope this helps!

Saturday, September 19, 2020

Burnout

 

image via Peakpx
I recently listened to an interesting podcast about burnout that stimulated some thoughts regarding this silent killer that could easily get rampant, especially in the software industry, which is known to be very mentally demanding.

This industry attracts very passionate people who, given an interesting enough problem, will voluntarily give up a lot of their time and energy, along with parts of their social lives and health.

While seeking the satisfaction of solving complex problems or under tight delivery pressure, developers "get into the zone" and spend extended hours without even noticing.

Commonly, developers take pride in this aspect of their work. Other developers consider this a role model for how a dedicated developer should be. Managers celebrate the heroic efforts of their developers and, even worse, take them for granted until they become a normal expectation.

But what's wrong with this? If the developer is really passionate about his/her work, so what?

One of the light bulb moments in this podcast was when Dr Aneika (PhD in Organizational Behavior and Human Resources) said:

"…you would think that some research or previous research said, well, maybe engagement is the antonym to burnout. But no, what we really found out is that people that are really, really engaged are the ones that are most susceptible to burnout"

"…to be a great developer, to be a great programmer, or to be a great coder, you have to really be involved. And that involvement that takes you in and sucks you in could be the same thing that can lead you down the road of burnout."

No surprise then that developers can go through waves of extreme productivity followed by low performance if they are not conscious enough of how their minds and emotions work.

Another important aspect to consider especially if you're a leader in tech is the impact of your burnout on how you interact with those who you lead.

One component of burnout is depersonalization: when you're burnt out, you get detached from the surrounding team members and focus only on what you get out of them. To you, they become more like functions with inputs and outputs, and your relationship becomes merely transactional, which is very dangerous.

To me, one of the most important leadership traits is empathy. When you're drained to the extent that you have no emotional capacity for empathy, you lose the ability to connect and support your team members. And especially if you're normally understanding and supportive, your fluctuating behaviour might hurt the trust you've earned.

Watch for the signs of burnout. And remember not to deplete all your energy before taking the time to recharge.

Saturday, January 4, 2020

Which language should I speak?

Working in a diverse environment with team members from many nationalities is a great experience. You get to know new cultures and recognize how similar people are across the world despite the seemingly extreme differences.
In such an environment, you hear different languages all the time! And although there is usually a de facto business language (English in my case, since I'm currently working in Australia), some people prefer to have conversations in their native tongue with colleagues who share the same language, even in a business context.

Well, is that OK?
There are many angles from which I see this matter.

It's good to feel natural

As a non-native English speaker myself, I feel very weird speaking with my Arabic-speaking colleagues (especially Egyptians) in a secondary language. It just doesn't feel natural! Why speak in a language that we wouldn't normally use if we were having a casual chat? Not to mention losing access to a huge stock of vocabulary and expressions that we share. This leads to the second point:

It's about effective communication

We need to get the job done, right? So why put a barrier in front of effective communication? Undoubtedly using my native language makes conveying my thoughts much easier. Besides, it gives better control over the tone of the conversation. I suppose the same goes for other nationalities as well.

But what are we missing?

Some people might feel excluded when others around them speak in a language they don't understand. However, I haven't seen this causing real issues.

A virtual wall?

I've been working in Agile teams for years, and I believe in the value of co-located teams in facilitating communication.
Many times I have overheard a discussion between colleagues in my team area and jumped in to help solve an issue, give guidance on a topic, or throw in a piece of information that was necessary to solve a problem. Even if you're not intentionally paying attention, it's possible to save the team from consuming a lot of time going in circles.
Speaking in a different language defeats the purpose of co-location and creates virtual walls. It's the same reason why some Agile practitioners recommend against wearing headphones, as they isolate the team member from the surrounding team interactions.

What about you? Do you prefer speaking in your first language if different from the common one used at work? On the other side, how do you feel about other colleagues speaking in a language that you don't understand?

Friday, April 26, 2019

Using Git hooks to alter commit messages

As developers, we try to get the repetitive, boring stuff out of our way. Hence we use tools that automate some of our workflows, and if no tool is available for our specific needs, no problem: we automate them ourselves. We're developers after all!

In one of the projects I worked on, there was a convention to add the task id to each commit message because some tools used it to generate reports. I'm not sure why this was required in that situation, but I had to follow the convention anyway. Since I tend to make many small commits every day, I was sure I'd forget to add the task id most of the time. So I started investigating Git hooks.

Git provides many hooks that can automate repetitive steps at different points in the Git workflow. For example:
    • pre-commit
    • pre-push
    • prepare-commit-msg
    • commit-msg

The folder ".git/hooks" within the git repository folder contains many sample commit hook files which are good starting points. The one of interest in this case was the commit-msg hook.

In my scenario, we had a convention to name our branches using the patterns "feature/<task id>" or "bug/<task id>".

So I decided to deduce the task id from the branch name and prepend it to the commit message.
I created a file named commit-msg in the .git/hooks folder, with code similar to:

#!/bin/sh
message=$(cat "$1")
branch=$(git branch | grep '\*' | cut -d ' ' -f2-)
task=$(echo "$branch" | cut -d / -f2-)
echo "$task - $message" > "$1"
  • Line 2: reads the original commit message from the temp file, whose name is passed as the first parameter to the script.
  • Line 3: reads the current branch name. Thanks to StackOverflow.
  • Line 4: extracts the task id from the branch name by splitting the string by the "/" character and taking the second part.
  • Line 5: overwrites the commit message with the required format.
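The same steps can be sketched a little more defensively. This is a variant under my own assumptions, not the hook from the project: the helper names extract_task_id and prefix_message are hypothetical, the branch name is read with git symbolic-ref instead of parsing git branch output, and the prefix is skipped when the message already starts with it (for example when amending a commit).

```shell
#!/bin/sh
# Hypothetical helper: everything after the first "/" in the branch name.
extract_task_id() {
    printf '%s\n' "${1#*/}"
}

# Hypothetical helper: prepend "<task> - " unless the message already has it.
prefix_message() {
    task=$1
    message=$2
    case $message in
        "$task - "*) printf '%s\n' "$message" ;;
        *)           printf '%s\n' "$task - $message" ;;
    esac
}

# When installed as .git/hooks/commit-msg, git passes the message file as $1.
if [ -n "${1:-}" ] && [ -f "$1" ]; then
    # On a detached HEAD there is no branch name, so leave the message alone.
    branch=$(git symbolic-ref --short HEAD 2>/dev/null) || exit 0
    task=$(extract_task_id "$branch")
    message=$(cat "$1")
    prefix_message "$task" "$message" > "$1"
fi
```

For a branch named feature/1234, extract_task_id yields 1234, and prefix_message turns "test message" into "1234 - test message" while leaving an already prefixed message unchanged.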

Now when I commit code using:
git commit -m"test message"
And then inspect the logs using the git log command, the commit message is modified as needed:
commit f1fe8918c754ca89649a2a86ef4ab0a9a53c0496 (HEAD -> feature/1234)
Author: Hesham A. Amin
Date:   Fri Apr 26 08:24:40 2019 +0200

    1234 - test message

commit 4e3e180d3a27772a32230bf6dbbd039b949dc30e
...

Investing a few minutes to automate tedious repetitive tasks pays off in the long term.

Thursday, December 27, 2018

Removing the Server header from Kestrel hosted ASP.NET core apps

In the continuous battle of software builders against attackers, the less information the application discloses about its infrastructure the better.
One of the issues I've repeatedly seen in penetration testing reports for web applications is the existence of the Server header, which is described on MDN as follows:

The Server header contains information about the software used by the origin server to handle the request.

Also as mentioned by MDN:

Overly long and detailed Server values should be avoided as they potentially reveal internal implementation details that might make it (slightly) easier for attackers to find and exploit known security holes.

By default, when using the Kestrel web server to host an ASP.NET Core application, Kestrel returns the Server header with the value Kestrel as shown in this screenshot from Postman:

Even though it doesn't sound like a big security risk, I just prefer to remove this header. This can be achieved by adding this line to the ConfigureServices method in the application's Startup class:
services.PostConfigure<KestrelServerOptions>(k => k.AddServerHeader = false);

PostConfigure actions run after all Configure actions, so it's a good place to override the default behavior.

Sunday, September 24, 2017

Azure Event Grid WebHooks - Retries (Part 3)

Building distributed systems is challenging. If not carefully designed and implemented, a failure in one component can cause cascading failures that affect the whole system. That's why patterns like Retry and Circuit Breaker should be considered to improve system resilience. When sending WebHooks, the situation might be even worse, as your system is calling a totally external system with no availability guarantees, and doing so over the internet, which is less reliable than your internal network.
Continuing on the previous parts of this series (Part 1, Part 2) I'll show how to use Azure Event Grid to overcome this challenge.

Azure Event Grid Retry Policy

Azure Event Grid provides a built-in capability to retry failed requests with exponential backoff, which means that in case the WebHook request fails, it will be retried with increased delays.
As per the documentation, failed requests will be retried after 10 seconds, and if the request fails again, it will keep retrying after 30 seconds, 1 minute, 5 minutes, 10 minutes, 30 minutes, and 1 hour. However, these numbers aren't exact, as Azure Event Grid adds some randomization to the intervals.
Events that are not delivered within 2 hours expire. This duration should be increased to 24 hours after the preview phase.
This behavior is not trivial to implement, which adds to the reasons why a service like Azure Event Grid should be considered as an alternative to implementing its capabilities from scratch.

Testing Azure Event Grid Retry

To try this capability and building on the example used in Part 1, I made a change to the AWS Lambda function that receives the WebHook to introduce random failures:

public object Handle(Event[] request)
{
    Event data = request[0];
    if(data.Data.validationCode!=null)
    {
        return new {validationResponse = data.Data.validationCode};
    }

    var random = new Random(Guid.NewGuid().GetHashCode());
    var value = random.Next(1, 11); // uniform integer from 1 to 10

    if(value > 5)
    {
        throw new Exception("Failure!");
    }

    return "";
}

Lines 9-15 produce a failure rate of about 50%. When I pushed an event (as shown in the previous posts) to 1000 WebHook subscribers, the result was the chart below, depicting the number of API calls per minute and the number of 500 errors per minute:


Number of requests per minute (Blue) - Number of 500 Errors per minute (Orange)

We can observe the following:
  • The number of errors (orange) is almost half the number of requests (blue).
  • The number of requests per minute is around 1500 for the first minute. My explanation is that since we have 1000 listeners and a 50% failure rate, Azure made an extra 500 requests.
  • After a bit less than 2 hours (not shown in the chart due to size constraints) the number of errors had dropped to 5 and no more requests were made. This is due to the expiration period during the preview.
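
As a rough sanity check of the last observation, the sketch below adds up the documented retry delays and counts how many delivery attempts fit inside the 2-hour expiry window. It assumes the delays apply exactly as listed (no randomization), that any retry after the 1-hour delay would be at least another hour away, and that every attempt fails independently with 50% probability. The variable names are illustrative only.

```shell
#!/bin/sh
# Documented delays between attempts, in seconds (randomization ignored).
delays="10 30 60 300 600 1800 3600"
expiry=7200        # 2-hour expiration during the preview
subscribers=1000

elapsed=0
attempts=1         # the initial delivery attempt
for delay in $delays; do
    next=$((elapsed + delay))
    if [ "$next" -gt "$expiry" ]; then
        break
    fi
    elapsed=$next
    attempts=$((attempts + 1))
done

# Expected number of subscribers for which every attempt failed.
remaining=$(awk -v n="$subscribers" -v a="$attempts" 'BEGIN { printf "%.1f", n / (2 ^ a) }')

echo "attempts within expiry: $attempts (last retry at ${elapsed}s)"
echo "expected subscribers still failing: $remaining"
```

Under these assumptions there are 8 attempts in total, with the last retry landing at 6400 seconds, which leaves about 1000 / 2^8 ≈ 3.9 subscribers still failing. That is in the same range as the 5 errors observed.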

Summary

Azure Event Grid is a scalable and resilient service that can handle thousands (maybe more) of WebHook receivers. Whether your solution is hosted on premises or on Azure, you can use this service to offload a lot of work and effort.
I wish that Azure Event Grid gave some insight into how events are pushed and received, which would help a lot in troubleshooting, as the subscriber is usually not under your control. I hope this will become an integrated part of the Azure portal.
It's worth mentioning that other cloud providers offer services with functionality that overlaps with Event Grid and are worth checking, specifically Amazon Simple Notification Service (SNS) and Google Cloud Pub/Sub.