Static Analysis of macOS Application Bundles for Territorial Affiliation

    Tech Note
  • Software Analysis
  • Security

Introduction

Following the full-scale invasion of Ukraine by the russian federation on February 24, 2022, it became necessary to strengthen not only the country's physical defenses but also its digital protection.

In July 2016, russia enacted Federal Laws No. 374-FZ and 375-FZ, which require telecom providers to store the content of voice calls, data, images, and text messages on russian servers for 6 months, and their metadata (e.g., time, location, message sender, and recipients) for 3 years. Online services that use encryption, such as messengers, email, and social networks, are required to let the Federal Security Service (FSB) access and read their encrypted communications.

This means that all internet and telecom companies with some presence in russia are obliged to disclose these communications, their metadata, and "all other information necessary" to authorities on request, without a court order. Given these considerations, we need a way to discover potentially unwanted software with connections to the aggressor, so that a decision about its removal can be made. In this research, we use static analysis to examine macOS application bundles.

Inspecting the bundle contents

Software development and quality assurance teams use static analysis in various software engineering tasks. Static analysis is a debugging method that examines a program's code and supplementary resources without executing the program. The process provides an understanding of the code structure and can help ensure that the code adheres to industry standards. Dedicated automated tools can scan all the code in a project to check it for vulnerabilities and validate it.

We are primarily interested in discovering territorial affiliation information, so we experiment with several heuristics for code inspection that allow us to extract relevant pieces of data.

Bundle identifiers

In the Apple ecosystem, a bundle ID or bundle identifier uniquely identifies an application, and no two applications can have the same bundle identifier. To avoid identifier conflicts, Apple encourages developers to use reverse domain name notation to choose an application's bundle identifier.

The bundle identifier can contain country-specific information, such as an ISO country code used as a part of the reverse domain name, for example: ru.keepcoder.Telegram
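The check described above can be sketched as a small Swift function. This is a minimal illustration rather than a definitive implementation: the function name and the `countryCodes` parameter are hypothetical, and the sketch compares whole dot-separated components so that identifiers such as com.trust.app are not flagged merely because they contain the letters "ru".

```swift
/// Returns the dot-separated components of a bundle identifier that match
/// one of the given ISO 3166-1 alpha-2 country codes (case-insensitive).
/// `countryCodes` is a hypothetical parameter introduced for this sketch.
func countryComponents(inBundleID bundleID: String,
                       countryCodes: Set<String> = ["ru"]) -> [String] {
    bundleID
        .split(separator: ".")
        .map { $0.lowercased() }
        .filter { countryCodes.contains($0) }
}

print(countryComponents(inBundleID: "ru.keepcoder.Telegram"))      // ["ru"]
print(countryComponents(inBundleID: "com.macpaw.CleanMyMac-mas"))  // []
```

A match is only a hint: a country code in the identifier says where the developer registered a domain, not where the software is operated.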

Using a static database as a deny list

The most basic and common way to associate an application with its country of origin is a static deny list, in which individual applications are identified by their bundle IDs.

Example of a deny list database entry:

{
  "webLink" : "https:\/\/www.themoscowtimes.com\/2021\/03\/24\/telegram-raises-1bln-with-russian-direct-investment-fund-buying-bonds-a73346",
  "reason" : "Probable ties to the Russian Federation",
  "bundleIdSubstring" : "telegram",
  "humanReadableDescription" : "Caution: Telegram doesn't verify or fact-check information shared by users or media in groups; bad actors can use this to spread disinformation. Probable ties to the Russian Federation, according to media publications."
}

In the example above, we use the "telegram" substring to detect the Telegram application family.
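A lookup against such a database could look roughly like the sketch below. The struct mirrors the four keys of the entry shown above; the type and function names are hypothetical, and the case-insensitive comparison is an assumption about how the substring is meant to be matched.

```swift
import Foundation

/// Mirrors the keys of the deny list entry shown above.
struct DenyListEntry: Decodable {
    let webLink: String
    let reason: String
    let bundleIdSubstring: String
    let humanReadableDescription: String
}

/// Returns the first entry whose substring occurs in the bundle identifier.
/// Both sides are lowercased, so "Telegram" and "telegram" are equivalent.
func denyListMatch(for bundleID: String, in entries: [DenyListEntry]) -> DenyListEntry? {
    let haystack = bundleID.lowercased()
    return entries.first { haystack.contains($0.bundleIdSubstring.lowercased()) }
}

// A one-entry database; the description is shortened for the example.
let json = """
[{
  "webLink": "https://www.themoscowtimes.com/2021/03/24/telegram-raises-1bln-with-russian-direct-investment-fund-buying-bonds-a73346",
  "reason": "Probable ties to the Russian Federation",
  "bundleIdSubstring": "telegram",
  "humanReadableDescription": "Probable ties to the Russian Federation, according to media publications."
}]
"""
let entries = try JSONDecoder().decode([DenyListEntry].self, from: Data(json.utf8))

// Prints "Probable ties to the Russian Federation"
print(denyListMatch(for: "ru.keepcoder.Telegram", in: entries)?.reason ?? "no match")
```

Substring matching catches a whole application family with one entry, at the cost of possible false positives when an unrelated identifier happens to contain the same substring.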

Code signing information

As described in the Apple Code Signing Guide:

Code signing is a macOS security technology that you use to certify that you created an app. Once an app is signed, the system can detect any change to the app — whether the change is introduced accidentally or by malicious code. You participate in code signing as a developer when you obtain a signing identity and apply your signature to the apps you ship. A certificate authority (often Apple) vouches for your signing identity.

A bundle's code signature contains various security information, along with the certificates that were used to create the signature. Code signing information is available through Security.framework:

A SecCertificateRef object for a certificate that is stored in a keychain can be safely cast to a SecKeychainItemRef for manipulation as a keychain item. On the other hand, if the SecCertificateRef is not stored in a keychain, casting the object to a SecKeychainItemRef and passing it to Keychain Services functions returns errors.

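// Create a static code object for the bundle at `path`.
var staticCode: SecStaticCode?
guard let appURL = CFURLCreateWithFileSystemPath(kCFAllocatorDefault, path as CFString, .cfurlposixPathStyle, true) else {
    return
}
// Check the returned status instead of ignoring it.
guard SecStaticCodeCreateWithPath(appURL, SecCSFlags(), &staticCode) == errSecSuccess,
      let codeRef = staticCode else {
    return
}
// Request the signing information, which includes the certificate chain.
var signingDictionary: CFDictionary?
let status = SecCodeCopySigningInformation(codeRef, SecCSFlags(rawValue: kSecCSSigningInformation), &signingDictionary)
guard status == errSecSuccess else {
    return
}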

However, the country recorded in a signing certificate is not exposed through a dedicated public function; it is available only through a private API:

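// Private Security.framework symbol. It is weak-imported so that the NULL check below is meaningful.
extern CFArrayRef SecCertificateCopyCountry(SecCertificateRef certificate) __attribute__((weak_import));

@implementation CertificateOriginProvider

- (NSArray *)certificateOriginsFromCertificates:(SecCertificateRef)certificate {
    if (&SecCertificateCopyCountry == NULL) {
        return @[];
    }
    // "Copy" functions return a +1 reference; hand ownership to ARC to avoid a leak.
    NSArray *countries = CFBridgingRelease(SecCertificateCopyCountry(certificate));
    return countries ?: @[];
}

@end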

Information property list file

Bundles, which represent executables of different kinds, contain an information property list file. This collection of key-value pairs specifies how the system should interpret the associated bundle. Some key-value pairs characterize the bundle itself, while others configure the app, the framework, or another entity the bundle represents. Some keys are required, while others are specific to particular features of the executable. Two keys are of interest here: SUFeedURL and NSHumanReadableCopyright.

SUFeedURL is a key defined by Sparkle, a third-party open-source framework. It holds the URL where the appcast feed is hosted. The approach described in the AppStore metadata section can be applied to this URL as well.

NSHumanReadableCopyright contains the copyright text shown in the "About" window. It often represents the developer's website or company name.
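Reading these two keys can be sketched as follows. The example parses raw property list data with PropertyListSerialization so that it stays self-contained; the `InfoPlistHints` type, the function name, and the sample values (including the example.ru host) are hypothetical.

```swift
import Foundation

/// The two Info.plist values discussed above. Either may be missing.
struct InfoPlistHints {
    let feedHost: String?    // host part of SUFeedURL
    let copyright: String?   // NSHumanReadableCopyright
}

/// Parses raw Info.plist data (XML or binary) and extracts the two hints.
func infoPlistHints(from data: Data) -> InfoPlistHints? {
    guard let plist = try? PropertyListSerialization.propertyList(from: data, options: [], format: nil),
          let dict = plist as? [String: Any] else {
        return nil
    }
    let feedHost = (dict["SUFeedURL"] as? String).flatMap { URL(string: $0)?.host }
    return InfoPlistHints(feedHost: feedHost,
                          copyright: dict["NSHumanReadableCopyright"] as? String)
}

// Hypothetical Info.plist content used only for illustration.
let samplePlist = """
<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>SUFeedURL</key>
    <string>https://updates.example.ru/appcast.xml</string>
    <key>NSHumanReadableCopyright</key>
    <string>Copyright 2022 Example LLC</string>
</dict>
</plist>
"""
let hints = infoPlistHints(from: Data(samplePlist.utf8))
print(hints?.feedHost ?? "no feed URL")
```

For an installed application, the same dictionary is available without manual parsing through the `infoDictionary` property of `Bundle`.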

Bundle localization

Localization is a common practice for making an application accessible across the world. macOS applications are localized by including string resources with translations, which are applied according to the user's preferred language.

Foundation has a native API to retrieve the list of a bundle's localizations:

// Bundle(url:) is failable, so fall back to an empty list.
let bundle = Bundle(url: url)
let localizations = bundle?.localizations ?? []
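One possible way to turn the localization list into a signal is sketched below. The three-level classification is an assumption made for this example, not something the note prescribes, and the `LocalizationSignal` type and function name are hypothetical. Legacy directory names such as "Russian" are not handled and would need an extra mapping.

```swift
/// How a target language figures among a bundle's localizations.
enum LocalizationSignal: Equatable {
    case absent      // the language is not shipped at all
    case oneOfMany   // shipped alongside other languages
    case only        // the sole real localization
}

func localizationSignal(for language: String, in localizations: [String]) -> LocalizationSignal {
    let lang = language.lowercased()
    // "Base" is a placeholder for the development language, not a real language.
    let real = Set(localizations.map { $0.lowercased() }.filter { $0 != "base" })
    // Match "ru" as well as regional variants such as "ru_RU" or "ru-RU".
    let matches = real.filter {
        $0 == lang || $0.hasPrefix(lang + "_") || $0.hasPrefix(lang + "-")
    }
    if matches.isEmpty { return .absent }
    return matches.count == real.count ? .only : .oneOfMany
}

print(localizationSignal(for: "ru", in: ["Base", "en", "ru"]))  // oneOfMany
print(localizationSignal(for: "ru", in: ["Base", "ru"]))        // only
```

A Russian localization alongside many others is common for international software, so only the `.only` case is a reasonably strong hint on its own.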

Mach-O files

Mach-O is the native executable format of binaries in macOS and is the preferred format for shipping code. It determines the order in which the code and data in a binary file are read into memory. The ordering of code and data has implications for memory usage and paging activity and thus directly affects the performance of a program.

A Mach-O binary is organized into segments. Each segment contains one or more sections. Code or data of different types go into each section. Segments always start on a page boundary, but sections are not necessarily page-aligned. The size of a segment is measured by the number of bytes in all the sections it contains and rounded up to the next virtual memory page boundary. Thus, with 4-kilobyte pages, a segment is always a multiple of 4096 bytes, with 4096 bytes being the minimum size (Apple silicon Macs use 16-kilobyte pages).

The __cstring section contains literal string constants (quoted strings in source code), which makes it a natural place to look for URLs stored in the binary. macOS has built-in tools to extract strings from binary files and search them for URLs:

$ strings /Applications/Setapp/CleanMyMac\ X.app/Contents/MacOS/CleanMyMac | grep "http:"
http://www.homesweeklies.com/homepage
http://www.weknow.ac
http://www.google.
http://www.apple.
http://www.news.com

The same search can be done in code with MachO-Kit, an open-source third-party library:

func parseBinaryURLs(applicationURL: URL) -> [String] {
    guard let bundle = Bundle(url: applicationURL), let executableURL = bundle.executableURL else {
        return []
    }
    do {
        let memoryMap = try MKMemoryMap(contentsOfFile: executableURL)
        let offset: UInt32 = {
            guard let fatBinary = try? MKFatBinary(memoryMap: memoryMap),
                    let slice64 = fatBinary.architectures.first(where: { $0.cputype == CPU_TYPE_X86_64 }) else {
                return 0
            }
            return slice64.offset
        }()
        let macho = try MKMachOImage(name: "MKMemoryMap", flags: .init(rawValue: 0), atAddress: mk_vm_address_t(offset), inMapping: memoryMap)
        guard let strings = macho.sections.first(where: { $0.value.name == "__cstring" }), let stringSection = strings.value as? MKCStringSection else {
            return []
        }
        let siteURLs = stringSection
            .strings
            .compactMap { $0.string?.lowercased() }
            .filter(stringsFilter)
        return siteURLs
    } catch {
        return []
    }
}

func stringsFilter(_ string: String) -> Bool {
    let searchedStrings = [".ru", "vk.com", "yandex.com"]
    return (string.hasPrefix("http") || string.contains("yandex")) && searchedStrings.contains(where: { string.lowercased().contains($0) })
}

Augmenting analysis with external information

AppStore metadata

For every application distributed through the AppStore, we can obtain its metadata by using the AppStore Search API:

The Search API allows you to place search fields in your website to search for content within the iTunes Store, App Store, iBooks Store and Mac App Store. You can search for a variety of content; including apps, iBooks, movies, podcasts, music, music videos, audiobooks, and TV shows. You can also call an ID-based lookup request to create mappings between your content library and the digital catalog.

A Search API request example:

curl "https://itunes.apple.com/lookup?entity=software&bundleId=com.macpaw.CleanMyMac-mas"

Search API response (excerpt):

"kind": "mac-software",
"minimumOsVersion": "10.10",
"trackCensoredName": "CleanMyMac X",
"languageCodesISO2A": [
  "NL",
  "UK"
],
"fileSizeBytes": "88072777",
"sellerUrl": "https://macpaw.com",
"formattedPrice": "Free",
"contentAdvisoryRating": "4+",
"averageUserRatingForCurrentVersion": 0,
"userRatingCountForCurrentVersion": 0,
"averageUserRating": 0,
"trackContentRating": "4+",
"bundleId": "com.macpaw.CleanMyMac-mas",
"releaseDate": "2020-04-28T07:00:00Z",
"trackName": "CleanMyMac X",
"primaryGenreName": "Utilities",
"isVppDeviceBasedLicensingEnabled": true,
"sellerName": "MacPaw Inc.",
"currentVersionReleaseDate": "2022-05-24T09:22:29Z",
"releaseNotes": "Improved:\nEnhanced our malware detection system to make sure your Mac is protected at all times.\n\nFixed:\nMinor issues and known bugs.",
"primaryGenreId": 6002,
"currency": "USD",
"description": "Delete megatons of junk, malware, and make your Mac faster & more organized.\n\nCleanMyMac X packs 30+ tools to help you solve the most common Mac issues. You can use it to manage storage, apps, and monitor the health of your computer. There are even personalized cleanup tips based on how you use your Mac.\n\n\nKEY FEATURES\n\nFree up space\n\nDelete gigabytes of system junk, broken data, and caches.\nFind large and old files scattered across all folders.\nVisualize your storage and find your largest space-wasters.\n\nProtect your Mac\n\nScan your Mac for the latest viruses and adware.\nDelete malware agents like keyloggers, spyware, etc.\nClear out browsing history and tracking cookies.\n\nUninstall apps\n\nFind and delete unwanted apps completely.\nReset broken apps to their default state.\nRemove extensions and background plugins.\n\nMonitor Mac's health\n\nSee real-time data about battery and processor load.\nMonitor network speed and available memory.\nGet personalized Mac cleanup tips.\n\n\nAWARD-WINNING DESIGN\n\nWinner of iF Design Award 2020\nProduct Hunt "App of the month”\nMacStories "Must-Have Mac App” 2019\n\nCleanMyMac X turns the not so exciting task of cleaning your computer into a stylish and interactive ride. It places simplicity at the core of its design. With smart and self-learning algorithms under the hood, the app stays incredibly easy to use. \n\n\nWHAT MAC EXPERTS SAY \n\n"If you've found yourself struggling with a nearly full Mac, check out CleanMyMac X. The app has been an excellent way to recover space with minimal effort for many years and I expect it will continue to be so for many more.”\n\nMacStories\n\n"CleanMyMac X makes it easy to maintain a healthy Mac. 
Its built-in tools make it easy to rid your machine of unwanted apps and files, protect it against malware, and more.”\n\nCult of Mac\n\n"Users will appreciate CleanMyMac X's streamlined, attractive interface, which includes clear icons and gentle animations to make the scrubbing process pleasant.”\n\nVentureBeat\n\n"From insane speed improvements to malware removal, a new menu design, and more, this release is packed with new features that you are going to want to check out.”\n\niMore\n\n\nSUBSCRIPTION AND PRICING\n \nSome features are only partially available for non-paying users and require an in-app purchase. For example, the non-paying users can clean 500 MB of junk across all modules and up to 1 GB of junk in the Space Lens module.\n \nSee the pricing details in the Information section under In-App Purchases.\n \nHave questions? We are always here to help. Please message our support using the contact below.\nhttps://macpaw.com/support/contact\n\nTerms of Service https://macpaw.com/cleanmymac-x-terms-of-service-mas\nPrivacy Policy https://macpaw.com/cleanmymac-x-privacy-policy \n \nMac is a trademark of Apple Inc.\niTunes is a trademark of Apple Inc.",
"artistId": 403752295,
"artistName": "MacPaw Inc.",
"price": 0.00,
"version": "4.10.6",
"wrapperType": "software",
"userRatingCount": 0

Some of the fields are relevant for our investigation.
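Decoding only the fields of interest could look like the sketch below. The type names are hypothetical, and treating `sellerUrl` as optional is an assumption; the lookup endpoint wraps its matches in a `results` array, which the excerpt above omits.

```swift
import Foundation

/// The subset of lookup fields used in this note; other keys are ignored.
struct AppStoreApp: Decodable {
    let bundleId: String
    let sellerName: String
    let artistName: String
    let sellerUrl: String?            // assumed optional for this sketch
    let languageCodesISO2A: [String]
}

/// The lookup response wraps the matching apps in a `results` array.
struct LookupResponse: Decodable {
    let results: [AppStoreApp]
}

// A trimmed copy of the response shown above.
let body = """
{"resultCount": 1, "results": [{
  "kind": "mac-software",
  "bundleId": "com.macpaw.CleanMyMac-mas",
  "sellerName": "MacPaw Inc.",
  "artistName": "MacPaw Inc.",
  "sellerUrl": "https://macpaw.com",
  "languageCodesISO2A": ["NL", "UK"]
}]}
"""
let response = try JSONDecoder().decode(LookupResponse.self, from: Data(body.utf8))
if let app = response.results.first {
    print(app.sellerName, app.sellerUrl ?? "-", app.languageCodesISO2A)
}
```

Because `Decodable` ignores unknown keys, the struct can stay limited to the fields the analysis actually uses.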

The sellerName and artistName fields can be checked against a list of well-known country-affiliated developers or developer websites.

The sellerUrl field can be checked for a match with a domain name of the affiliated country. For example, if the country of interest is russia, we can check whether the host name ends with the "ru" top-level domain.
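A minimal sketch of such a domain check is shown below; the function name is hypothetical. It compares the end of the host name rather than searching for "ru" anywhere in the URL, so hosts such as ru.example.com or guru.com are not flagged.

```swift
import Foundation

/// Returns true when the URL's host ends with the given top-level domain.
func urlHost(_ urlString: String, endsWithTLD tld: String) -> Bool {
    guard let host = URL(string: urlString)?.host?.lowercased() else {
        return false   // not a URL, or a URL without a host
    }
    return host.hasSuffix("." + tld.lowercased())
}

print(urlHost("https://macpaw.com", endsWithTLD: "ru"))        // false
print(urlHost("https://example.ru/about", endsWithTLD: "ru"))  // true
```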

Another important part of the sellerUrl investigation is GeoIP information. Each domain resolves to one or more IP addresses, which point to physical servers. To find the country associated with a server, we can use GeoIP services. macOS ships with the whois command-line tool, which can serve this purpose.

A dig query resolves the domain name to an IP address, which is then passed to whois:

$ dig macpaw.com
;; ANSWER SECTION:
macpaw.com.		163	IN	A	104.18.31.100

$ whois 104.18.31.100
% IANA WHOIS server
% for more information on IANA, visit http://www.iana.org
% This query returned 1 object

OrgName:        Cloudflare, Inc.
OrgId:          CLOUD14
Address:        101 Townsend Street
City:           San Francisco
StateProv:      CA
PostalCode:     94107
Country:        US
RegDate:        2010-07-09
Updated:        2021-07-01
Ref:            https://rdap.arin.net/registry/entity/CLOUD14
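Extracting the country from such output can be sketched as a small text parser. The function name is hypothetical, and matching the key case-insensitively is an assumption meant to cover registries that write "country:" in lowercase.

```swift
import Foundation

/// Collects the values of all "Country:" lines in raw whois output.
func whoisCountries(in output: String) -> [String] {
    output
        .split(whereSeparator: \.isNewline)
        .compactMap { line -> String? in
            // Split "Key:   value" at the first colon only.
            let parts = line.split(separator: ":", maxSplits: 1)
            guard parts.count == 2,
                  parts[0].trimmingCharacters(in: .whitespaces).lowercased() == "country" else {
                return nil
            }
            return parts[1].trimmingCharacters(in: .whitespaces).uppercased()
        }
}

// A shortened copy of the output shown above.
let whoisOutput = """
% IANA WHOIS server
OrgName:        Cloudflare, Inc.
City:           San Francisco
Country:        US
Ref:            https://rdap.arin.net/registry/entity/CLOUD14
"""
print(whoisCountries(in: whoisOutput))  // ["US"]
```

The result should be read with care: whois reports the organization that registered the address block, so for sites behind a content delivery network such as Cloudflare it describes the network provider rather than the seller's own server.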

The same domain-to-address resolution is available programmatically through the CFHost API (CFHostStartInfoResolution) together with the POSIX getnameinfo function:

class DNSResolver {
    
    func resolve(domain: String) -> String? {
        let host = CFHostCreateWithName(nil, domain as CFString).takeRetainedValue()
        CFHostStartInfoResolution(host, .addresses, nil)
        var success: DarwinBoolean = false
        if let addresses = CFHostGetAddressing(host, &success)?.takeUnretainedValue() as NSArray?,
            let theAddress = addresses.firstObject as? NSData {
            var hostname = [CChar](repeating: 0, count: Int(NI_MAXHOST))
            if getnameinfo(theAddress.bytes.assumingMemoryBound(to: sockaddr.self), socklen_t(theAddress.length),
                           &hostname, socklen_t(hostname.count), nil, 0, NI_NUMERICHOST) == 0 {
                let numAddress = String(cString: hostname)
                return numAddress
            }
        }
        return nil
    }
}

Another important key in the Search API response is languageCodesISO2A. It represents the languages of the AppStore page localizations.

"languageCodesISO2A": [
  "NL",
  "UK"
],

App title and description

Another part of bundle analysis is detecting the language of the application's AppStore title and description using natural language processing. On macOS and iOS, this is possible with the Natural Language framework:

The Natural Language framework provides a variety of natural language processing (NLP) functionality with support for many different languages and scripts. Use this framework to segment natural language text into paragraphs, sentences, or words, and tag information about those segments, such as part of speech, lexical class, lemma, script, and language.

Language detection is provided by NLLanguageRecognizer:

import NaturalLanguage

let languageRecognizer = NLLanguageRecognizer()

languageRecognizer.processString(description)
if languageRecognizer.dominantLanguage == .russian {
    //Description is written in Russian
}
languageRecognizer.reset()
languageRecognizer.processString(title)
if languageRecognizer.dominantLanguage == .russian {
    //Title is written in Russian
}

Conclusion

Static analysis of macOS application bundles is a straightforward and powerful approach to determining the country affiliation of software. The mechanism is easy to extend, can be revised over time, and can serve as a basis for security-related software such as SpyBuster.

SpyBuster Static Analysis Screen
SpyBuster Detection Threshold Preferences Screen
