Verify on CI that gleam binary architectures match target architectures #3897

Open
wants to merge 6 commits into
base: main

diemogebhardt
@diemogebhardt diemogebhardt commented Nov 26, 2024

This PR is regarding issue #1630:

  • Adds a bash script to verify that the binary architecture matches the target architecture:
    • ./bin/verify-binary-architecture.sh
  • Adds CI workflows to test the bash script on a wider variety of target architectures and OSes than the release binaries are actually built for. They mirror the existing workflows up to the build/install step, but they remove all the safety precautions that restrict them to certain preconditions or branches, because it wouldn't be possible to test the bash script otherwise. Everything works as expected in my fork. The workflows are configured to be triggered manually:
    • ./.github/workflows/ci-verify-binary-architecture.yaml
    • ./.github/workflows/release-nightly-verify-binary-architecture.yaml
    • ./.github/workflows/release-verify-binary-architecture.yaml
  • Adds CI workflow binary architecture verification steps to:
    • ./.github/workflows/ci.yaml
    • ./.github/workflows/release-nightly.yaml
    • ./.github/workflows/release.yaml

How it works:

  • The bash script takes the target architecture and the binary path as arguments and checks that the binary was built for that architecture
  • The CI workflow steps construct the corresponding binary path and pass it to the bash script
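
The workflow side of this can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper names `arch_from_target` and `binary_path_for_target` are hypothetical, and it assumes the target is available as a full Rust target triple and that binaries land in cargo's usual `target/<triple>/release/` directory.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not taken from the PR.

# Print the architecture component of a target triple,
# e.g. "aarch64-apple-darwin" -> "aarch64".
arch_from_target() {
  # ${1%%-*} removes the longest suffix that starts with a "-".
  printf '%s\n' "${1%%-*}"
}

# Print the assumed path of the release binary for a target triple,
# appending ".exe" for Windows targets.
binary_path_for_target() {
  path="target/$1/release/gleam"
  case "$1" in
    *windows*) path="${path}.exe" ;;
  esac
  printf '%s\n' "$path"
}
```

A workflow step could then call something like `./bin/verify-binary-architecture.sh "$target" "$(binary_path_for_target "$target")"`; the argument order shown here is an assumption based on the description above.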

When reviewing:

  • It's probably best to review the changed files rather than the individual commits. On the first pass, skip the ./.github/workflows/*-verify-binary-architecture.yaml workflows, as they add a lot of noise, and focus on the bash script and the changes to the existing workflows instead.

Notes and Todos:

  • In the initial version of the bash script I used associative arrays, which led to beautiful and straightforward code. Sadly they aren't supported on darwin, so I had to refactor it to use case statements, which makes it a lot more verbose. I added comments to make the parts easier to understand. Instead of commenting, I usually prefer making code self-explanatory by extracting things, but in this case it felt better to keep everything colocated and add comments.
  • Decide on whether to remove the new test CI workflows before merging? Or keep them around for future testing?
  • Document the file outputs for the various target architectures and OSes somewhere for future reference?

Member

@lpil lpil left a comment

Thank you!

Why is it that there are new workflows for this? There is a very large amount of extra work being performed and CI complexity to maintain here. I would have expected 1 extra step in each workflow to run the check command.

@@ -0,0 +1,75 @@
#!/usr/bin/env bash
Member

Regular shell please 🙏

Author

Ahh, ok – shall we change it for all the other scripts as well then?

shell: bash is being used for all the inline scripts of the workflow steps. That's why I went for bash to not mix and match shells.

;;
*"windows"*)
# Parse binary architecture
pe_header_output=$(powershell -Command "
Member

Is file unavailable on the Windows runners, even though they have a POSIX shell installed?

Author

I'm afraid it isn't available. The same is true for which vs where. It's been a pain getting this to work on Windows in the least verbose way.

That's the reason why I opted for creating separate workflows. So that I can test an even broader set of target architectures and OSes and make it work:

target:
  - x86_64-unknown-linux-gnu
  - aarch64-unknown-linux-gnu
  - x86_64-unknown-linux-musl
  - aarch64-unknown-linux-musl
  - x86_64-apple-darwin
  - aarch64-apple-darwin
  - x86_64-pc-windows-msvc
  - aarch64-pc-windows-msvc

Author

Interesting. I added an isolated file output test workflow step on a separate testing branch:

- name: Test file output
  shell: bash
  run: |
    set -xeuo pipefail

    BINARY_PATH="target/${{ matrix.target }}/release/gleam"
    if [[ "${{ matrix.target }}" == *"windows"* ]]; then
        BINARY_PATH="${BINARY_PATH}.exe"
    fi

    file_output=$(file "${BINARY_PATH}")

And ran it on the various target architectures and OSes:

target/x86_64-unknown-linux-gnu/release/gleam: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=7fa5e3ea50b409f61625cfaa5cf439a30de76936, for GNU/Linux 3.2.0, not stripped

target/aarch64-unknown-linux-gnu/release/gleam: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, BuildID[sha1]=75394ba54846e1cd0654df8f49cf615896418733, not stripped

target/x86_64-unknown-linux-musl/release/gleam: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), static-pie linked, BuildID[sha1]=1892a4eef798936540c664af5c9f3f96d7253d6f, not stripped

target/aarch64-unknown-linux-musl/release/gleam: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, not stripped

target/x86_64-apple-darwin/release/gleam: Mach-O 64-bit executable x86_64

target/aarch64-apple-darwin/release/gleam: Mach-O 64-bit executable arm64

target/x86_64-pc-windows-msvc/release/gleam.exe: PE32+ executable (console) x86-64, for MS Windows, 5 sections

target/aarch64-pc-windows-msvc/release/gleam.exe: PE32+ executable (console) Aarch64, for MS Windows, 11 sections

Seems like I was wrong: it works, sorry! Not sure why it didn't previously. Let me fix that.
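
The outputs above spell the architecture differently per platform: x86-64 on Linux and Windows, x86_64 on macOS, aarch64 on Linux, arm64 on macOS and Aarch64 on Windows. One way to compare them against a target is to normalize them first. The following is a minimal sketch, not the PR's code; `normalize_arch` is a hypothetical name, and it assumes it receives the output of `file -b`, since the non-brief output starts with the binary path, which itself contains the target triple and could cause a false match.

```shell
#!/bin/sh
# Hypothetical helper: map the output of `file -b` to a canonical
# architecture name. Assumes the path prefix has already been omitted.
normalize_arch() {
  case "$1" in
    *x86-64* | *x86_64*) echo "x86_64" ;;
    *aarch64* | *Aarch64* | *arm64*) echo "aarch64" ;;
    *) echo "unknown" ;;
  esac
}
```

With this, a check could compare `normalize_arch "$(file -b "$BINARY_PATH")"` against the architecture part of the target triple.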

Author

Oops, I figured out why it appeared to not work when I did this initially... 🤦

Anyways: refactored the bash script to use file on Windows too in commit 425e8fb.

file_output=$(file -b "${BINARY_PATH}")
BINARY_ARCHITECTURE=$(echo "${file_output}" | grep -o "x86_64\|arm64" || echo "")
# Map expected binary architecture
case "${TARGET_ARCHITECTURE}" in
Member

Let's pass the expected substring in from the workflow rather than having mappings from the target triple to these strings here.

Author

The expected substring is different for each target architecture and OS.

In my initial version I had it mapped like so:

declare -A ARCHITECTURE_PATTERNS=(
  ["darwin"]="x86_64\|arm64"
  ["linux"]="x86-64\|aarch64"
)
declare -A ARCHITECTURE_MAP=(
  ["darwin:x86_64"]="x86_64"
  ["darwin:aarch64"]="arm64"
  ["linux:x86_64"]="x86-64"
  ["linux:aarch64"]="aarch64"
  ["windows:x86_64"]="X64"
  ["windows:aarch64"]="Arm64"
)

Which reduced the logic to just:

BINARY_ARCHITECTURE=$(file "${BINARY_PATH}" | grep -o "${ARCHITECTURE_PATTERNS["linux"]}" || echo "")
EXPECTED_BINARY_ARCHITECTURE="${ARCHITECTURE_MAP["linux:${TARGET_ARCHITECTURE}"]:-}"

Way clearer, but sadly it doesn't run on darwin.
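
For context, macOS ships Bash 3.2, and associative arrays (`declare -A`) were only added in Bash 4.0. A case statement over a combined key gives the same lookup in any POSIX shell. This is a rough sketch rather than the refactored script itself: the function name `expected_binary_architecture` is hypothetical, and the values are copied from the map above.

```shell
#!/bin/sh
# Hypothetical stand-in for the ARCHITECTURE_MAP associative array.
# $1: OS name (darwin, linux, windows), $2: target architecture.
# Prints the expected substring, or returns 1 for an unknown combination.
expected_binary_architecture() {
  case "$1:$2" in
    darwin:x86_64)   echo "x86_64" ;;
    darwin:aarch64)  echo "arm64" ;;
    linux:x86_64)    echo "x86-64" ;;
    linux:aarch64)   echo "aarch64" ;;
    windows:x86_64)  echo "X64" ;;
    windows:aarch64) echo "Arm64" ;;
    *) return 1 ;;
  esac
}
```

The non-zero return for unknown keys plays the role of the `:-` default in the array version, letting the caller fail loudly on an unsupported target.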

Author

Hope it's clearer now in the refactored version of the script?

@diemogebhardt
Author

Thank you!

Why is it that there are new workflows for this? There is a very large amount of extra work being performed and CI complexity to maintain here. I would have expected 1 extra step in each workflow to run the check command.

Let me quote some bits from above:

It wouldn't be possible to test the bash script otherwise – everything works as expected in my fork.

Decide on whether to remove the new test CI workflows before merging? Or keep them around for future testing?

I just added them for testing purposes – wanted to make sure this works in the exact environment it is supposed to be run: GitHub runners.

This way:

  • We are able to mirror the exact behavior of the ci, release and release-nightly workflows without running all the additional steps those workflows perform after building the binary and verifying its architecture.
  • We don't have to meet the workflows' prerequisites and constraints, such as:
    • release-nightly being constrained to run only if: ${{ github.repository_owner == 'gleam-lang' }}
    • release running only on: push: tags: "v*"
  • We are able to test even more target architectures and OSes, which makes the script more universal and lowers the chance of breaking it when adding architectures and OSes later on.
  • We can run the workflows in the gleam repo in the context of this PR and verify that everything works prior to merging.

I wanted to keep the changes to the existing workflows as minimal as possible.

…t architecture to use file command on Windows
2 participants