Calculating Checksums

Overview

It is good practice, when transferring any file (including downloading an executable from the internet!), to compare checksums. Doing so confirms that an upload succeeded, or that the file you downloaded really matches what was on the server.

 

Malicious actors sometimes tamper with commonly downloaded files, but any modification will change the checksum. For this reason, file creators calculate checksums (often MD5 sums, as produced by the md5sum command) and publish them on their websites.

 

For more information, and to learn how to generate a checksum, see any of the following:

 

https://knowledge.autodesk.com/search-result/caas/sfdcarticles/sfdcarticles/Checking-the-MD5-checksum-of-a-Downloaded-File.html

 

https://www.tutorialspoint.com/unix_commands/md5sum.htm

 

https://www.quickprogrammingtips.com/python/how-to-calculate-md5-hash-of-a-file-in-python.html
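
As a quick illustration of what those guides describe, here is a minimal Python sketch that computes the MD5 checksum of a local file using only the standard library. The function name md5_of_file and the 1 MiB chunk size are illustrative choices, not taken from any of the linked articles.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that large files do not have to fit
    in memory. `md5_of_file` and `chunk_size` are hypothetical names
    chosen for this sketch.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file("foo.txt") should return the same hex string that the md5 or md5sum command prints for that file.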

Checksums and Transfer Family

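When a file is uploaded to S3, including over SFTP as described above, AWS computes a checksum, which is stored as the object's S3 "ETag" attribute. For objects uploaded in a single part and stored unencrypted or with SSE-S3 encryption, as in the example below, the ETag is the MD5 digest of the file's contents. For multipart uploads, or for objects encrypted with SSE-KMS or SSE-C, the ETag is not an MD5 digest.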

 

Example:

 

> aws s3api head-object --bucket mybucket --key Boston/jca-test/foo.txt | cat

{

    "AcceptRanges": "bytes",

    "LastModified": "2021-09-22T18:44:18+00:00",

    "ContentLength": 4,

    "ETag": "\"d3b07384d113edec49eaa6238ad5ff00\"",

    "VersionId": "52fyw9M_r1OA1N3wwSx6koi8SPegV0kO",

    "ContentType": "text/plain",

    "ServerSideEncryption": "AES256",

    "Metadata": {

        "user-agent": "AWSTransfer",

        "user-agent-id": "jabraham@s-1caef8d95eaf414bb"

    }

}

 

If you download this file and compute its MD5 checksum, the result matches the ETag above:

 

> md5 foo.txt

MD5 (foo.txt) = d3b07384d113edec49eaa6238ad5ff00
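
The same comparison can be scripted. The following is a minimal Python sketch, assuming you already have the head-object response as a dictionary (for example, the parsed JSON from the command above, or the return value of boto3's head_object call if boto3 is installed). The function name etag_matches_file is hypothetical. The sketch also assumes the ETag is a plain MD5 digest, which is not the case for multipart uploads, whose ETags carry a "-N" suffix.

```python
import hashlib


def etag_matches_file(head_object_response, local_path):
    """Compare the S3 ETag in a head-object response with a local file's MD5.

    `etag_matches_file` is a hypothetical helper for this sketch.
    Raises ValueError for multipart-style ETags, which are not MD5 digests.
    """
    # S3 returns the ETag wrapped in literal double quotes; strip them.
    etag = head_object_response["ETag"].strip('"')
    if "-" in etag:
        raise ValueError("multipart ETag is not a plain MD5 digest: " + etag)

    digest = hashlib.md5()
    with open(local_path, "rb") as handle:
        # Read in 1 MiB chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == etag.lower()


# Example usage with the response shown above (only the ETag field is needed):
# response = {"ETag": "\"d3b07384d113edec49eaa6238ad5ff00\""}
# etag_matches_file(response, "foo.txt")
```

Raising an error for multipart ETags, rather than returning False, is a design choice in this sketch: it separates "the file differs" from "this ETag cannot be checked this way".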

Verifying Checksums with Transfer Family

The moral of the story: there is no need to compute checksums yourself on the files you have uploaded. S3 does that for you and stores the result as the ETag.

 

If you are generating and uploading large numbers of files and wish to verify that they were uploaded successfully, we have written a utility for this purpose, available on GitHub.

 

Please see the README.md for complete instructions.
