Package: file

Version: 5.46
Release: 1.niceos5
Architecture: x86_64
GOST hash: c5037e54a818820b6d55e7a13e40e6ba96fe6287b63980af1de48b48110451ea
MD5 hash: 41da4ca272c90c4319f2a168717355a3
SHA256 hash: 503d6a8eb6e166a5a5bc0093ca38613b8b939c4afdf6b2e2a1f9a7ee3f345dae
License: BSD
Build date: May 12, 2025
Size: 51.899 MiB
Compatible OS
RPM file: file-5.46-1.niceos5.x86_64.rpm

Subpackages

Name	Short description
file-libs	No description available
file-devel	No description available
lib32-file	32-bit libraries for file

Dependencies

Name	Type	Version
file-libs	runtime	-
libbz2.so.1.0()(64bit)	runtime	-
libc.so.6()(64bit)	runtime	-
libc.so.6(GLIBC_2.2.5)(64bit)	runtime	-
libc.so.6(GLIBC_2.34)(64bit)	runtime	-
libc.so.6(GLIBC_2.38)(64bit)	runtime	-
libc.so.6(GLIBC_2.4)(64bit)	runtime	-
liblzma.so.5()(64bit)	runtime	-
libm.so.6()(64bit)	runtime	-
libmagic.so.1()(64bit)	runtime	-
libseccomp.so.2()(64bit)	runtime	-
libz.so.1()(64bit)	runtime	-
libzstd.so.1()(64bit)	runtime	-
rtld(GNU_HASH)	runtime	-

Changelog

Date	Author	Message
Mar 31, 2025	Stanislav Belikov <sbelikov@ncsgp.ru>	First build of file

Package files

/usr/bin/file 35.148 KiB

/usr/share/man/man1/file.1.gz 8.659 KiB

/usr/share/man/man4/magic.4.gz 8.092 KiB

Documentation (man pages)

FILE(1)			  BSD General Commands Manual			FILE(1)

NAME
     file - determine file type

SYNOPSIS
     file [-bcdEhiklLNnprsSvzZ0] [--apple] [--exclude-quiet] [--extension]
	  [--mime-encoding] [--mime-type] [-e testname] [-F separator]
	  [-f namefile] [-m magicfiles] [-P name=value] file ...
     file -C [-m magicfiles]
     file [--help]

DESCRIPTION
     This manual page documents version 5.46 of the file command.

     file tests each argument in an attempt to classify it. There are three
     sets of tests, performed in this order: filesystem tests, magic tests,
     and language tests. The first test that succeeds causes the file type
     to be printed.

     The type printed will usually contain one of the words text (the file
     contains only printing characters and a few common control characters
     and is probably safe to read on an ASCII terminal), executable (the
     file contains the result of compiling a program in a form
     understandable to some UNIX kernel or another), or data, meaning
     anything else (data is usually “binary” or non-printable). Exceptions
     are well-known file formats (core files, tar archives) that are known
     to contain binary data. When modifying magic files or the program
     itself, make sure to preserve these keywords. Users depend on knowing
     that all the readable files in a directory have the word “text”
     printed. Don't do as Berkeley did and change “shell commands text” to
     “shell script”.

     The filesystem tests are based on examining the return from a stat(2)
     system call. The program checks to see if the file is empty, or if it
     is some sort of special file. Any known file types appropriate to the
     system you are running on (sockets, symbolic links, or named pipes
     (FIFOs) on those systems that implement them) are recognized if they
     are defined in the system header file <sys/stat.h>.
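
     The filesystem tests can be approximated with the standard stat
     macros. The following is a minimal Python sketch, not the code used
     by file; the function name filesystem_test and the returned strings
     are assumptions made for illustration.

```python
import os
import stat


def filesystem_test(path):
    """Return a coarse type from lstat(2) alone, or None if contents must be read."""
    info = os.lstat(path)  # lstat: do not follow symbolic links
    mode = info.st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISREG(mode) and info.st_size == 0:
        return "empty"
    return None  # a non-empty regular file: left to the magic and language tests
```

     A None result corresponds to the case where the later test groups
     have to look at the file contents.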

     The magic tests are used to check for files with data in particular
     fixed formats. The canonical example of this is a binary executable
     (compiled program) a.out file, whose format is defined in <elf.h>,
     <a.out.h> and possibly <exec.h> in the standard include directory.
     These files have a “magic number” stored in a particular place near
     the beginning of the file that tells the UNIX operating system that
     the file is a binary executable, and which of several types thereof.
     The concept of a “magic number” has been applied by extension to data
     files. Any file with some invariant identifier at a small fixed offset
     into the file can usually be described in this way. The information
     identifying these files is read from the compiled magic file
     /usr/share/misc/magic.mgc, or the files in the directory
     /usr/share/misc/magic if the compiled file does not exist. In
     addition, if $HOME/.magic.mgc or $HOME/.magic exists, it will be used
     in preference to the system magic files.
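
     The idea of an invariant identifier at a fixed offset can be shown
     with a very small lookup table. This is an illustrative Python
     sketch; MAGIC_TABLE and match_magic are hypothetical names, and the
     table is a hand-picked sample, not the contents of the real magic
     database.

```python
# Hypothetical mini-table: (offset, magic bytes, description).
# The real database (magic.mgc) is far larger and supports many more test types.
MAGIC_TABLE = [
    (0, b"\x7fELF", "ELF"),
    (0, b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"%PDF-", "PDF document"),
    (257, b"ustar", "POSIX tar archive"),
]


def match_magic(data):
    """Return the description of the first entry whose bytes appear at its offset."""
    for offset, magic, description in MAGIC_TABLE:
        if data[offset:offset + len(magic)] == magic:
            return description
    return None  # no entry matched; a text test would come next
```

     Entries are tried in table order and the first match wins, so the
     order of entries affects the result.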

     If a file does not match any of the entries in the magic file, it is
     examined to see if it seems to be a text file. ASCII, ISO-8859-x,
     non-ISO 8-bit extended-ASCII character sets (such as those used on
     Macintosh and IBM PC systems), UTF-8-encoded Unicode, UTF-16-encoded
     Unicode, and EBCDIC character sets can be distinguished by the
     different ranges and sequences of bytes that constitute printable text
     in each set. If a file passes any of these tests, its character set is
     reported. ASCII, ISO-8859-x, UTF-8, and extended-ASCII files are
     identified as “text” because they will be mostly readable on nearly
     any terminal; UTF-16 and EBCDIC are only “character data” because,
     while they contain text, it is text that will require translation
     before it can be read. In addition, file will attempt to determine
     other characteristics of text-type files. If the lines of a file are
     terminated by CR, CRLF, or NEL, instead of the Unix-standard LF, this
     will be reported. Files that contain embedded escape sequences or
     overstriking will also be identified.
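
     A much reduced version of the text test is sketched below in Python.
     It only distinguishes ASCII from UTF-8 and reports CR, CRLF, or LF
     line terminators; ISO-8859-x, UTF-16, EBCDIC, and NEL are left out.
     The name classify_text and the set of tolerated control bytes are
     assumptions, not taken from the file sources.

```python
# Control bytes tolerated in text here: BEL, BS, TAB, LF, FF, CR, ESC (an assumed set).
ALLOWED_CONTROLS = {7, 8, 9, 10, 12, 13, 27}


def classify_text(data):
    """Return (charset, line_terminator), or None if the bytes do not look like text."""
    if any(b < 32 and b not in ALLOWED_CONTROLS for b in data) or 127 in data:
        return None
    if all(b < 128 for b in data):
        charset = "ASCII"
    else:
        try:
            data.decode("utf-8")
        except UnicodeDecodeError:
            return None  # other 8-bit character sets are not handled in this sketch
        charset = "UTF-8"
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # bare CR, not part of a CRLF pair
    lf = data.count(b"\n") - crlf  # bare LF, not part of a CRLF pair
    if crlf and not cr and not lf:
        terminator = "CRLF"
    elif cr and not crlf and not lf:
        terminator = "CR"
    elif crlf or cr:
        terminator = "mixed"
    else:
        terminator = "LF"
    return charset, terminator
```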

     Once file has determined the character set used in a text-type file,
     it will attempt to determine in what language the file is written. The
     language tests look for particular strings (cf. <names.h>) that can
     appear anywhere in the first few blocks of a file. For example, the
     keyword .br indicates that the file is most likely a troff(1) input
     file, just as the keyword struct indicates a C program. These tests
     are less reliable than the previous two groups, so they are performed
     last. The language test routines also test for some miscellany (such
     as tar(1) archives and JSON files).
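
     The keyword lookup described above might look roughly like the
     following Python sketch. The table holds only the two keywords named
     in the text, and the 4096-character search window is an assumption;
     neither reflects the actual contents of names.h.

```python
# Toy keyword table built from the two examples in the text; not the real names.h.
KEYWORDS = [
    (".br", "troff input"),
    ("struct", "C program"),
]


def guess_language(text, window=4096):
    """Guess a language from whole-word keywords near the start of the text."""
    tokens = text[:window].split()  # only the beginning of the file is searched
    for keyword, language in KEYWORDS:
        if keyword in tokens:
            return language
    return None
```

     Matching on single keywords is easy to fool, which is consistent
     with these tests being run last.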

     Any file that cannot be identified as having been written in any of
     the character sets listed above is simply said to be “data”.

OPTIONS
     --apple
	     Causes the file command to output the file type and creator
	     code as used by older MacOS versions. The code consists of
	     eight letters, the first four describing the file type, the
	     last four the creator. This option works properly only for
	     file formats that have the apple-style output defined.

     -b, --brief
	     Do not prepend filenames to output lines (brief mode).

     -C, --compile
	     Write a magic.mgc output file that contains a pre-parsed
	     version of the magic file or directory.

     -c, --checking-printout
	     Cause a checking printout of the parsed form of the magic file.
	     This is usually used in conjunction with the -m option to debug
	     a new magic file before installing it.

     -d     Print internal debugging information to stderr.

     -E     On filesystem errors (file not found etc.), instead of handling
	     the error as regular output as POSIX mandates and continuing,
	     issue an error message and exit.

     -e, --exclude testname
	     Exclude the test named in testname from the list of tests made
	     to determine the file type. Valid test names are:

	     apptype   EMX application type (only on EMX).

	     ascii     Various types of text files (this test will try to
		       guess the text encoding, irrespective of the setting
		       of the ‘encoding’ option).

	     encoding  Different text encodings for soft magic tests.

	     tokens    Ignored for backwards compatibility.

	     cdf       Prints details of Compound Document Files.

	     compress  Checks for, and looks inside, compressed files.

	     csv       Checks Comma Separated Value files.

	     elf       Prints ELF file details, provided soft magic tests are
		       enabled and the elf magic is found.

	     json      Examines JSON (RFC-7159) files by parsing them for
		       compliance.

	     soft      Consults magic files.

	     simh      Examines SIMH tape files.

	     tar       Examines tar files by verifying the checksum of the
		       512-byte tar header. Excluding this test can provide
		       a more detailed content description by using the soft
		       magic method.

	     text      A synonym for ‘ascii’.
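
     The tar test in the list above relies on the header checksum. A
     Python sketch of that check follows; it is not the is_tar.c code,
     and the helper names tar_checksum_ok and sample_header are invented
     here. It assumes the usual header layout, with the checksum stored
     as octal digits in bytes 148 to 155.

```python
import tarfile


def tar_checksum_ok(header):
    """Verify the checksum stored in a 512-byte tar header block."""
    if len(header) != 512:
        return False
    field = header[148:156].strip(b" \0")  # octal digits padded with NUL/space
    try:
        stored = int(field, 8)
    except ValueError:
        return False  # empty or non-octal field
    # The sum is taken with the checksum field itself counted as 8 spaces.
    computed = sum(header[:148]) + 8 * ord(" ") + sum(header[156:])
    return stored == computed


def sample_header():
    """Build a valid header with the standard library, for demonstration only."""
    info = tarfile.TarInfo(name="hello.txt")
    return info.tobuf(format=tarfile.USTAR_FORMAT)
```

     A block that fails this check is not treated as tar by this sketch,
     even if it happens to contain the string “ustar”.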

     --exclude-quiet
	     Like --exclude but ignore tests that file does not know about.
	     This is intended for compatibility with older versions of file.

     --extension
	     Print a slash-separated list of valid extensions for the file
	     type found.

     -F, --separator separator
	     Use the specified string as the separator between the filename
	     and the file result returned. Defaults to ‘:’.

     -f, --files-from namefile
	     Read the names of the files to be examined from namefile (one
	     per line) before the argument list. Either namefile or at least
	     one filename argument must be present; to test the standard
	     input, use ‘-’ as a filename argument. Please note that
	     namefile is unwrapped and the enclosed filenames are processed
	     when this option is encountered and before any further options
	     processing is done. This allows one to process multiple lists
	     of files with different command line arguments on the same file
	     invocation. Thus if you want to set the delimiter, you need to
	     do it before you specify the list of files, like:
	     “-F @ -f namefile”, instead of: “-f namefile -F @”.

     -h, --no-dereference
	     This option causes symlinks not to be followed (on systems that
	     support symbolic links). This is the default if the environment
	     variable POSIXLY_CORRECT is not defined.

     -i, --mime
	     Causes the file command to output mime type strings rather than
	     the more traditional human readable ones. Thus it may say
	     ‘text/plain; charset=us-ascii’ rather than “ASCII text”.

     --mime-type, --mime-encoding
	     Like -i, but print only the specified element(s).
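
     A program that consumes -i output usually needs the type and the
     charset separately. A small Python sketch, assuming the description
     has the ‘type; charset=name’ shape shown above; parse_mime is a
     hypothetical helper, not part of file or libmagic.

```python
def parse_mime(description):
    """Split 'type/subtype; charset=name' into (mime_type, charset_or_None)."""
    mime_type, _, rest = description.partition(";")
    charset = None
    for param in rest.split(";"):
        key, _, value = param.strip().partition("=")
        if key == "charset" and value:
            charset = value
    return mime_type.strip(), charset
```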

     -k, --keep-going
	     Don't stop at the first match, keep going. Subsequent matches
	     will have the string ‘\012- ’ prepended. (If you want a
	     newline, see the -r option.) The magic pattern with the highest
	     strength (see the -l option) comes first.
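
     With -k and without -r, the separator arrives as the six literal
     characters ‘\012- ’, so a consumer can split on that string. A
     Python sketch; split_matches is a hypothetical helper and the
     sample description in the test is made up.

```python
SEPARATOR = "\\012- "  # backslash, 0, 1, 2, hyphen, space: the escaped form


def split_matches(description):
    """Split a keep-going description into its individual matches."""
    return [part.strip() for part in description.split(SEPARATOR) if part.strip()]
```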

     -l, --list
	     Shows a list of patterns and their strength sorted descending
	     by magic(4) strength which is used for the matching (see also
	     the -k option).

     -L, --dereference
	     This option causes symlinks to be followed, as the like-named
	     option in ls(1) (on systems that support symbolic links). This
	     is the default if the environment variable POSIXLY_CORRECT is
	     defined.

     -m, --magic-file magicfiles
	     Specify an alternate list of files and directories containing
	     magic. This can be a single item, or a colon-separated list. If
	     a compiled magic file is found alongside a file or directory,
	     it will be used instead.

     -N, --no-pad
	     Don't pad filenames so that they align in the output.

     -n, --no-buffer
	     Force stdout to be flushed after checking each file. This is
	     only useful if checking a list of files. It is intended to be
	     used by programs that want filetype output from a pipe.

     -p, --preserve-date
	     On systems that support utime(3) or utimes(2), attempt to
	     preserve the access time of files analyzed, to pretend that
	     file never read them.

     -P, --parameter name=value
	     Set various parameter limits.

	     Name	  Default    Explanation
	     bytes	  1M	     max number of bytes to read from file
	     elf_notes	  256	     max ELF notes processed
	     elf_phnum	  2K	     max ELF program sections processed
	     elf_shnum	  32K	     max ELF sections processed
	     elf_shsize	  128MB	     max ELF section size processed
	     encoding	  65K	     max number of bytes to determine encoding
	     indir	  50	     recursion limit for indirect magic
	     name	  100	     use count limit for name/use magic
	     regex	  8K	     length limit for regex searches
     -r, --raw
	     Don't translate unprintable characters to \ooo. Normally file
	     translates unprintable characters to their octal
	     representation.
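
     The \ooo translation can be illustrated with a few lines of Python.
     This is a simplified sketch that treats every byte outside the
     printable ASCII range as unprintable; the real program is more
     careful with multi-byte text. The name escape_nonprintable is an
     assumption.

```python
def escape_nonprintable(data):
    """Render bytes as text, replacing non-printable bytes with \\ooo octal escapes."""
    return "".join(
        chr(b) if 32 <= b < 127 else "\\%03o" % b
        for b in data
    )
```

     A newline (byte 10) becomes \012 under this scheme, which is where
     the ‘\012- ’ separator of the -k option comes from.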

     -s, --special-files
	     Normally, file only attempts to read and determine the type of
	     argument files which stat(2) reports are ordinary files. This
	     prevents problems, because reading special files may have
	     peculiar consequences. Specifying the -s option causes file to
	     also read argument files which are block or character special
	     files. This is useful for determining the filesystem types of
	     the data in raw disk partitions, which are block special files.
	     This option also causes file to disregard the file size as
	     reported by stat(2) since on some systems it reports a zero
	     size for raw disk partitions.

     -S, --no-sandbox
	     On systems where libseccomp
	     (https://github.com/seccomp/libseccomp) is available, the -S
	     option disables sandboxing, which is enabled by default. This
	     option is needed for file to execute external decompressing
	     programs, i.e. when the -z option is specified and the built-in
	     decompressors are not available. On systems where sandboxing is
	     not available, this option has no effect.

     -v, --version
	     Print the version of the program and exit.

     -z, --uncompress
	     Try to look inside compressed files.

     -Z, --uncompress-noreport
	     Try to look inside compressed files, but report information
	     about the contents only, not the compression.

     -0, --print0
	     Output a null character ‘\0’ after the end of the filename.
	     Nice to cut(1) the output. This does not affect the separator,
	     which is still printed.

	     If this option is repeated more than once, then file prints
	     just the filename followed by a NUL followed by a description
	     (or ERROR: text) followed by a second NUL for each entry.
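
     The doubled form of this option gives a stream that is easy to
     parse even when filenames contain colons or newlines. A Python
     sketch of a parser for such a stream follows; parse_print00 is a
     hypothetical helper, and stripping whitespace around the
     description is a defensive assumption.

```python
import os


def parse_print00(raw):
    """Parse 'name NUL description NUL' records into (name, description) pairs."""
    fields = raw.split(b"\0")
    if fields and fields[-1] == b"":
        fields.pop()  # drop the empty tail after the final NUL
    if len(fields) % 2:
        raise ValueError("expected an even number of NUL-terminated fields")
    names, descriptions = fields[0::2], fields[1::2]
    return [
        (os.fsdecode(name), description.decode("utf-8", "replace").strip())
        for name, description in zip(names, descriptions)
    ]
```

     The input would typically be the captured standard output of a
     “file -00 ...” invocation, read as bytes.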

     --help  Print a help message and exit.

ENVIRONMENT
     The environment variable MAGIC can be used to set the default magic
     file name. If that variable is set, then file will not attempt to open
     $HOME/.magic. file adds “.mgc” to the value of this variable as
     appropriate. The environment variable POSIXLY_CORRECT controls (on
     systems that support symbolic links) whether file will attempt to
     follow symlinks or not. If set, then file follows symlinks, otherwise
     it does not. This is also controlled by the -L and -h options.

FILES
     /usr/share/misc/magic.mgc	Default compiled list of magic.
     /usr/share/misc/magic	Directory containing default magic files.

EXIT STATUS
     file will exit with 0 if the operation was successful or >0 if an
     error was encountered. The following errors cause diagnostic messages,
     but don't affect the program exit code (as POSIX requires), unless -E
     is specified:
	   •   A file cannot be found
	   •   There is no permission to read a file
	   •   The file type cannot be determined

EXAMPLES
	   $ file file.c file /dev/{wd0a,hda}
	   file.c:   C program text
	   file:     ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
		     dynamically linked (uses shared libs), stripped
	   /dev/wd0a: block special (0/0)
	   /dev/hda: block special (3/0)

	   $ file -s /dev/wd0{b,d}
	   /dev/wd0b: data
	   /dev/wd0d: x86 boot sector

	   $ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
	   /dev/hda:   x86 boot sector
	   /dev/hda1:  Linux/i386 ext2 filesystem
	   /dev/hda2:  x86 boot sector
	   /dev/hda3:  x86 boot sector, extended partition table
	   /dev/hda4:  Linux/i386 ext2 filesystem
	   /dev/hda5:  Linux/i386 swap file
	   /dev/hda6:  Linux/i386 swap file
	   /dev/hda7:  Linux/i386 swap file
	   /dev/hda8:  Linux/i386 swap file
	   /dev/hda9:  empty
	   /dev/hda10: empty

	   $ file -i file.c file /dev/{wd0a,hda}
	   file.c:	text/x-c
	   file:	application/x-executable
	   /dev/hda:	application/x-not-regular-file
	   /dev/wd0a:	application/x-not-regular-file

SEE ALSO
     hexdump(1), od(1), strings(1), magic(4)

STANDARDS CONFORMANCE
     This program is believed to exceed the System V Interface Definition
     of FILE(CMD), as near as one can determine from the vague language
     contained therein. Its behavior is mostly compatible with the System V
     program of the same name. This version knows more magic, however, so
     it will produce different (albeit more accurate) output in many cases.

     The one significant difference between this version and System V is
     that this version treats any white space as a delimiter, so that
     spaces in pattern strings must be escaped. For example,

	   >10	   string  language impress	   (imPRESS data)

     in an existing magic file would have to be changed to

	   >10	   string  language\ impress	   (imPRESS data)

     In addition, in this version, if a pattern string contains a
     backslash, it must be escaped. For example

	   0	   string	   \begindata	   Andrew Toolkit document

     in an existing magic file would have to be changed to

	   0	   string	   \\begindata	   Andrew Toolkit document
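
     Both escaping rules can be applied mechanically when converting old
     pattern strings. A minimal Python sketch; escape_pattern is a
     hypothetical helper that covers only the two rules stated above and
     no other magic syntax.

```python
def escape_pattern(pattern):
    """Escape backslashes and spaces in a System V style pattern string."""
    # Backslashes first, so the ones added for spaces are not doubled again.
    return pattern.replace("\\", "\\\\").replace(" ", "\\ ")
```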

     SunOS releases 3.2 and later from Sun Microsystems include a file
     command derived from the System V one, but with some extensions. This
     version differs from Sun's only in minor ways. It includes the
     extension of the ‘&’ operator, used as, for example,

	   >16	   long&0x7fffffff >0		   not stripped
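
     The entry above reads a long at offset 16, masks it with 0x7fffffff,
     and tests whether the result is greater than zero. The same
     evaluation in a Python sketch; masked_long_gt is a hypothetical
     helper, and little-endian byte order is an assumption (a plain long
     in a magic file follows the byte order of the host).

```python
import struct


def masked_long_gt(data, offset, mask, threshold, byteorder="<"):
    """Read a 32-bit value at offset, AND it with mask, and test '> threshold'."""
    (value,) = struct.unpack_from(byteorder + "L", data, offset)
    # With the top bit masked off, signed and unsigned readings agree.
    return (value & mask) > threshold
```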

SECURITY
     On systems where libseccomp (https://github.com/seccomp/libseccomp)
     is available, file restricts its system calls to only the ones
     necessary for the operation of the program. This enforcement does not
     provide any security benefit when file is asked to decompress input
     files by running external programs with the -z option. To enable
     execution of external decompressors, one needs to disable sandboxing
     using the -S option.

MAGIC DIRECTORY
     The magic file entries have been collected from various sources,
     mainly USENET, and contributed by various authors. Christos Zoulas
     (address below) will collect additional or corrected magic file
     entries. A consolidation of magic file entries will be distributed
     periodically.

     The order of entries in the magic file is significant. Depending on
     what system you are using, the order that they are put together may
     be incorrect. If your old file command uses a magic file, keep the old
     magic file around for comparison purposes (rename it to
     /usr/share/misc/magic.orig).

HISTORY
     There has been a file command in every UNIX since at least Research
     Version 4 (man page dated November, 1973). The System V version
     introduced one significant major change: the external list of magic
     types. This slowed the program down slightly but made it a lot more
     flexible.

     This program, based on the System V version, was written by Ian Darwin
     ⟨ian@darwinsys.com⟩ without looking at anybody else's source code.

     John Gilmore revised the code extensively, making it better than the
     first version. Geoff Collyer found several inadequacies and provided
     some magic file entries. Contributions of the ‘&’ operator by Rob
     McMahon, ⟨cudcv@warwick.ac.uk⟩, 1989.

     Guy Harris, ⟨guy@netapp.com⟩, made many changes from 1993 to the
     present.

     Primary development and maintenance from 1990 to the present by
     Christos Zoulas ⟨christos@astron.com⟩.

     Altered by Chris Lowth ⟨chris@lowth.com⟩, 2000: handle the -i option
     to output mime type strings, using an alternative magic file and
     internal logic.

     Altered by Eric Fischer ⟨enf@pobox.com⟩, July, 2000, to identify
     character codes and attempt to identify the languages of non-ASCII
     files.

     Altered by Reuben Thomas ⟨rrt@sc3d.org⟩, 2007-2011, to improve MIME
     support, merge MIME and non-MIME magic, support directories as well
     as files of magic, apply many bug fixes, update and fix a lot of
     magic, improve the build system, improve the documentation, and
     rewrite the Python bindings in pure Python.

     The list of contributors to the ‘magic’ directory (magic files) is
     too long to include here. You know who you are; thank you. Many
     contributors are listed in the source files.

LEGAL NOTICE
     Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999. Covered by
     the standard Berkeley Software Distribution copyright; see the file
     COPYING in the source distribution.

     The files tar.h and is_tar.c were written by John Gilmore from his
     public-domain tar(1) program, and are not covered by the above
     license.

BUGS
     Please report bugs and send patches to the bug tracker at
     https://bugs.astron.com/ or the mailing list at ⟨file@astron.com⟩
     (visit https://mailman.astron.com/mailman/listinfo/file first to
     subscribe).

TODO
     Fix output so that tests for MIME and APPLE flags are not needed all
     over the place, and actual output is only done in one place. This
     needs a design. Suggestion: push possible outputs on to a list, then
     pick the last-pushed (most specific, one hopes) value at the end, or
     use a default if the list is empty. This should not slow down
     evaluation.

     The handling of MAGIC_CONTINUE and printing \012- between entries is
     clumsy and complicated; refactor and centralize.

     Some of the encoding logic is hard-coded in encoding.c and could be
     moved to the magic files if we had a !:charset annotation.

     Continue to squash all magic bugs. See Debian BTS for a good source.

     Store arbitrarily long strings, for example for %s patterns, so that
     they can be printed out. Fixes Debian bug #271672. This can be done by
     allocating strings in a string pool, storing the string pool at the
     end of the magic file and converting all the string pointers to
     relative offsets from the string pool.

     Add syntax for relative offsets after current level (Debian bug
     #466037).

     Make file -ki work, i.e. give multiple MIME types.

     Add a zip library so we can peek inside Office2007 documents to print
     more details about their contents.

     Add an option to print URLs for the sources of the file descriptions.

     Combine script searches and add a way to map executable names to MIME
     types (e.g. have a magic value for !:mime which causes the resulting
     string to be looked up in a table). This would avoid adding the same
     magic repeatedly for each new hash-bang interpreter.

     When a file descriptor is available, we can skip and adjust the buffer
     instead of the hacky buffer management we do now.

     Fix “name” and “use” to check for consistency at compile time
     (duplicate “name”, “use” pointing to undefined “name”). Make
     “name” / “use” more efficient by keeping a sorted list of names.
     Special-case ^ to flip endianness in the parser so that it does not
     have to be escaped, and document it.

     If the offsets specified internally in the file exceed the buffer size
     (HOWMANY variable in file.h), then we don't seek to that offset, but
     we give up. It would be better if buffer management was done when the
     file descriptor is available so we can seek around the file. One must
     be careful though, because this has performance and thus security
     considerations: one can slow things down by repeatedly seeking.

     There is support now for keeping separate buffers and having offsets
     from the end of the file, but the internal buffer management still
     needs an overhaul.

AVAILABILITY
     You can obtain the original author's latest version by anonymous FTP
     on ftp.astron.com in the directory /pub/file/file-X.YZ.tar.gz.

BSD				 April 7, 2024				   BSD

FILE(1) BSD General Commands Manual FILE(1)

NAME
file — determine file type

SYNOPSIS
file [-bcdEhiklLNnprsSvzZ0] [--apple] [--exclude-quiet] [--extension]
[--mime-encoding] [--mime-type] [-e testname] [-F separator]
[-f namefile] [-m magicfiles] [-P name=value] file ...
file -C [-m magicfiles]
file [--help]

DESCRIPTION
This manual page documents version 5.46 of the file command.

file tests each argument in an attempt to classify it. There are three
sets of tests, performed in this order: filesystem tests, magic tests,
and language tests. The first test that succeeds causes the file type to
be printed.

The type printed will usually contain one of the words text (the file
contains only printing characters and a few common control characters and
is probably safe to read on an ASCII terminal), executable (the file con‐
tains the result of compiling a program in a form understandable to some
UNIX kernel or another), or data meaning anything else (data is usually
“binary” or non-printable). Exceptions are well-known file formats (core
files, tar archives) that are known to contain binary data. When modify‐
ing magic files or the program itself, make sure to preserve these
keywords. Users depend on knowing that all the readable files in a di‐
rectory have the word “text” printed. Don't do as Berkeley did and
change “shell commands text” to “shell script”.

The filesystem tests are based on examining the return from a stat(2)
system call. The program checks to see if the file is empty, or if it's
some sort of special file. Any known file types appropriate to the sys‐
tem you are running on (sockets, symbolic links, or named pipes (FIFOs)
on those systems that implement them) are intuited if they are defined in
the system header file <sys/stat.h>.

The magic tests are used to check for files with data in particular fixed
formats. The canonical example of this is a binary executable (compiled
program) a.out file, whose format is defined in <elf.h>, <a.out.h> and
possibly <exec.h> in the standard include directory. These files have a
“magic number” stored in a particular place near the beginning of the
file that tells the UNIX operating system that the file is a binary exe‐
cutable, and which of several types thereof. The concept of a “magic
number” has been applied by extension to data files. Any file with some
invariant identifier at a small fixed offset into the file can usually be
described in this way. The information identifying these files is read
from the compiled magic file /usr/share/misc/magic.mgc, or the files in
the directory /usr/share/misc/magic if the compiled file does not exist.
In addition, if $HOME/.magic.mgc or $HOME/.magic exists, it will be used
in preference to the system magic files.

If a file does not match any of the entries in the magic file, it is ex‐
amined to see if it seems to be a text file. ASCII, ISO-8859-x, non-ISO
8-bit extended-ASCII character sets (such as those used on Macintosh and
IBM PC systems), UTF-8-encoded Unicode, UTF-16-encoded Unicode, and
EBCDIC character sets can be distinguished by the different ranges and
sequences of bytes that constitute printable text in each set. If a file
passes any of these tests, its character set is reported. ASCII,
ISO-8859-x, UTF-8, and extended-ASCII files are identified as “text” be‐
cause they will be mostly readable on nearly any terminal; UTF-16 and
EBCDIC are only “character data” because, while they contain text, it is
text that will require translation before it can be read. In addition,
file will attempt to determine other characteristics of text-type files.
If the lines of a file are terminated by CR, CRLF, or NEL, instead of the
Unix-standard LF, this will be reported. Files that contain embedded es‐
cape sequences or overstriking will also be identified.

Once file has determined the character set used in a text-type file, it
will attempt to determine in what language the file is written. The lan‐
guage tests look for particular strings (cf. <names.h>) that can appear
anywhere in the first few blocks of a file. For example, the keyword .br
indicates that the file is most likely a troff(1) input file, just as the
keyword struct indicates a C program. These tests are less reliable than
the previous two groups, so they are performed last. The language test
routines also test for some miscellany (such as tar(1) archives, JSON
files).

Any file that cannot be identified as having been written in any of the
character sets listed above is simply said to be “data”.

OPTIONS
--apple
Causes the file command to output the file type and creator code
as used by older MacOS versions. The code consists of eight let‐
ters, the first describing the file type, the latter the creator.
This option works properly only for file formats that have the
apple-style output defined.

-b, --brief
Do not prepend filenames to output lines (brief mode).

-C, --compile
Write a magic.mgc output file that contains a pre-parsed version
of the magic file or directory.

-c, --checking-printout
Cause a checking printout of the parsed form of the magic file.
This is usually used in conjunction with the -m option to debug a
new magic file before installing it.

-d Prints internal debugging information to stderr.

-E On filesystem errors (file not found etc), instead of handling
the error as regular output as POSIX mandates and keep going, is‐
sue an error message and exit.

-e, --exclude testname
Exclude the test named in testname from the list of tests made to
determine the file type. Valid test names are:

apptype EMX application type (only on EMX).

ascii Various types of text files (this test will try to
guess the text encoding, irrespective of the setting of
the ‘encoding’ option).

encoding Different text encodings for soft magic tests.

tokens Ignored for backwards compatibility.

cdf Prints details of Compound Document Files.

compress Checks for, and looks inside, compressed files.

csv Checks Comma Separated Value files.

elf Prints ELF file details, provided soft magic tests are
enabled and the elf magic is found.

json Examines JSON (RFC-7159) files by parsing them for com‐
pliance.

soft Consults magic files.

simh Examines SIMH tape files.

tar Examines tar files by verifying the checksum of the 512
byte tar header. Excluding this test can provide more
detailed content description by using the soft magic
method.

text A synonym for ‘ascii’.

--exclude-quiet
Like --exclude but ignore tests that file does not know about.
This is intended for compatibility with older versions of file.

--extension
Print a slash-separated list of valid extensions for the file
type found.

-F, --separator separator
Use the specified string as the separator between the filename
and the file result returned. Defaults to ‘:’.

-f, --files-from namefile
Read the names of the files to be examined from namefile (one per
line) before the argument list. Either namefile or at least one
filename argument must be present; to test the standard input,
use ‘-’ as a filename argument. Please note that namefile is un‐
wrapped and the enclosed filenames are processed when this option
is encountered and before any further options processing is done.
This allows one to process multiple lists of files with different
command line arguments on the same file invocation. Thus if you
want to set the delimiter, you need to do it before you specify
the list of files, like: “-F @ -f namefile”, instead of: “-f
namefile -F @”.

-h, --no-dereference
This option causes symlinks not to be followed (on systems that
support symbolic links). This is the default if the environment
variable POSIXLY_CORRECT is not defined.

-i, --mime
Causes the file command to output mime type strings rather than
the more traditional human readable ones. Thus it may say
‘text/plain; charset=us-ascii’ rather than “ASCII text”.

--mime-type, --mime-encoding
Like -i, but print only the specified element(s).
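
A script that receives the combined -i form can separate the two elements itself. The Python sketch below assumes the description has the shape shown above (a MIME type, optionally followed by "; charset=..."); split_mime is a hypothetical helper, not part of file.

```python
from typing import Optional, Tuple


def split_mime(description: str) -> Tuple[str, Optional[str]]:
    """Split 'type/subtype; charset=name' into (mime_type, charset)."""
    mime_type, _, rest = description.partition(";")
    charset = None
    for param in rest.split(";"):
        key, _, value = param.strip().partition("=")
        if key == "charset" and value:
            charset = value
    return mime_type.strip(), charset
```

For the string quoted in the -i entry, the first element corresponds to what --mime-type prints and the second to what --mime-encoding prints.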

-k, --keep-going
Don't stop at the first match, keep going. Subsequent matches
will have the string ‘\012- ’ prepended. (If you want a newline,
see the -r option.) The magic pattern with the highest strength
(see the -l option) comes first.
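
When -k is used without -r, the separator arrives as the six literal characters backslash, 0, 1, 2, hyphen, space. A minimal Python sketch for splitting such a description back into its matches follows; split_matches is a hypothetical name and the sample descriptions in the usage note are made up.

```python
from typing import List

# Literal text, not a newline: backslash, '0', '1', '2', '-', ' '.
KEEP_GOING_SEPARATOR = "\\012- "


def split_matches(description: str) -> List[str]:
    """Split a -k description into its individual matches, strongest first."""
    return description.split(KEEP_GOING_SEPARATOR)
```

For example, split_matches("first match\\012- second match") yields the two matches as separate list items.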

-l, --list
Shows a list of patterns and their strength sorted descending by
magic(4) strength which is used for the matching (see also the -k
option).

-L, --dereference
This option causes symlinks to be followed, as the like-named op‐
tion in ls(1) (on systems that support symbolic links). This is
the default if the environment variable POSIXLY_CORRECT is de‐
fined.

-m, --magic-file magicfiles
Specify an alternate list of files and directories containing
magic. This can be a single item, or a colon-separated list. If
a compiled magic file is found alongside a file or directory, it
will be used instead.

-N, --no-pad
Don't pad filenames so that they align in the output.

-n, --no-buffer
Force stdout to be flushed after checking each file. This is
only useful if checking a list of files. It is intended to be
used by programs that want filetype output from a pipe.

-p, --preserve-date
On systems that support utime(3) or utimes(2), attempt to pre‐
serve the access time of files analyzed, to pretend that file
never read them.
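
The idea behind -p can be sketched in Python: record the timestamps, read the file, then write the timestamps back. This is a sketch of the technique under the assumption that os.utime is available, not a description of how file implements it; read_preserving_times is a hypothetical name and the 1 MiB limit merely echoes the default of the bytes parameter.

```python
import os


def read_preserving_times(path: str, limit: int = 1024 * 1024) -> bytes:
    """Read up to `limit` bytes, then put the recorded timestamps back."""
    before = os.stat(path)
    with open(path, "rb") as handle:
        data = handle.read(limit)
    # Restore access and modification times with nanosecond precision.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return data
```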

-P, --parameter name=value
Set various parameter limits.

Name         Default    Explanation
bytes        1M         max number of bytes to read from file
elf_notes    256        max ELF notes processed
elf_phnum    2K         max ELF program sections processed
elf_shnum    32K        max ELF sections processed
elf_shsize   128MB      max ELF section size processed
encoding     65K        max number of bytes to determine encoding
indir        50         recursion limit for indirect magic
name         100        use count limit for name/use magic
regex        8K         length limit for regex searches

-r, --raw
Don't translate unprintable characters to \ooo. Normally file
translates unprintable characters to their octal representation.

-s, --special-files
Normally, file only attempts to read and determine the type of
argument files which stat(2) reports are ordinary files. This
prevents problems, because reading special files may have pecu‐
liar consequences. Specifying the -s option causes file to also
read argument files which are block or character special files.
This is useful for determining the filesystem types of the data
in raw disk partitions, which are block special files. This op‐
tion also causes file to disregard the file size as reported by
stat(2) since on some systems it reports a zero size for raw disk
partitions.

-S, --no-sandbox
On systems where libseccomp
(https://github.com/seccomp/libseccomp) is available, the -S op‐
tion disables sandboxing which is enabled by default. This op‐
tion is needed for file to execute external decompressing pro‐
grams, i.e. when the -z option is specified and the built-in de‐
compressors are not available. On systems where sandboxing is
not available, this option has no effect.

-v, --version
Print the version of the program and exit.

-z, --uncompress
Try to look inside compressed files.

-Z, --uncompress-noreport
Try to look inside compressed files, but report information only
about the contents, not about the compression.

-0, --print0
Output a null character ‘\0’ after the end of the filename. This
is handy for splitting the output with cut(1). The separator is
not affected and is still printed.

If this option is repeated more than once, then file prints just
the filename followed by a NUL followed by the description (or
ERROR: text) followed by a second NUL for each entry.
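
The record layout described for the repeated option (filename, NUL, description, NUL) is easy to consume from a script. The Python sketch below assumes the captured stream contains exactly such records and nothing else; parse_print00 is a hypothetical helper and the sample records in the usage note are invented.

```python
from typing import List, Tuple


def parse_print00(stream: bytes) -> List[Tuple[bytes, bytes]]:
    """Split NUL-delimited output into (filename, description) pairs."""
    fields = stream.split(b"\0")
    # A well-formed stream ends with a NUL, so the last split item is empty.
    if fields and fields[-1] == b"":
        fields.pop()
    if len(fields) % 2:
        raise ValueError("expected filename/description pairs")
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields), 2)]
```

Given b"a.txt\0ASCII text\0b.bin\0data\0", the function returns two pairs. Filenames stay as bytes because they may contain any character except NUL.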

--help Print a help message and exit.

ENVIRONMENT
The environment variable MAGIC can be used to set the default magic file
name. If that variable is set, then file will not attempt to open
$HOME/.magic. file adds “.mgc” to the value of this variable as
appropriate. On systems that support symbolic links, the environment
variable POSIXLY_CORRECT controls whether file follows symlinks: if it is
set, file follows symlinks; otherwise it does not. This is also controlled
by the -L and -h options.

FILES
/usr/share/misc/magic.mgc  Default compiled list of magic.
/usr/share/misc/magic      Directory containing default magic files.

EXIT STATUS
file will exit with 0 if the operation was successful or >0 if an error
was encountered. The following errors cause diagnostic messages, but
don't affect the program exit code (as POSIX requires), unless -E is
specified:
• A file cannot be found
• There is no permission to read a file
• The file type cannot be determined

EXAMPLES
$ file file.c file /dev/{wd0a,hda}
file.c: C program text
file: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
dynamically linked (uses shared libs), stripped
/dev/wd0a: block special (0/0)
/dev/hda: block special (3/0)

$ file -s /dev/wd0{b,d}
/dev/wd0b: data
/dev/wd0d: x86 boot sector

$ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
/dev/hda: x86 boot sector
/dev/hda1: Linux/i386 ext2 filesystem
/dev/hda2: x86 boot sector
/dev/hda3: x86 boot sector, extended partition table
/dev/hda4: Linux/i386 ext2 filesystem
/dev/hda5: Linux/i386 swap file
/dev/hda6: Linux/i386 swap file
/dev/hda7: Linux/i386 swap file
/dev/hda8: Linux/i386 swap file
/dev/hda9: empty
/dev/hda10: empty

$ file -i file.c file /dev/{wd0a,hda}
file.c: text/x-c
file: application/x-executable
/dev/hda: application/x-not-regular-file
/dev/wd0a: application/x-not-regular-file

SEE ALSO
hexdump(1), od(1), strings(1), magic(4)

STANDARDS CONFORMANCE
This program is believed to exceed the System V Interface Definition of
FILE(CMD), as near as one can determine from the vague language contained
therein. Its behavior is mostly compatible with the System V program of
the same name. This version knows more magic, however, so it will pro‐
duce different (albeit more accurate) output in many cases.

The one significant difference between this version and System V is that
this version treats any white space as a delimiter, so that spaces in
pattern strings must be escaped. For example,

>10 string language impress (imPRESS data)

in an existing magic file would have to be changed to

>10 string language\ impress (imPRESS data)

In addition, in this version, if a pattern string contains a backslash,
it must be escaped. For example

0 string \begindata Andrew Toolkit document

in an existing magic file would have to be changed to

0 string \\begindata Andrew Toolkit document
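
Both escaping rules can be applied mechanically when converting an old magic file. The Python sketch below covers only the two cases described here (backslashes and spaces); escape_magic_pattern is a hypothetical helper name.

```python
def escape_magic_pattern(text: str) -> str:
    """Escape a pattern string for use in a magic file of this version."""
    # Double existing backslashes first so the ones added below stay single.
    escaped = text.replace("\\", "\\\\")
    # Spaces would otherwise end the pattern field.
    return escaped.replace(" ", "\\ ")
```

Applied to the two examples above, it turns "language impress" into "language\ impress" and "\begindata" into "\\begindata".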

SunOS releases 3.2 and later from Sun Microsystems include a file command
derived from the System V one, but with some extensions. This version
differs from Sun's only in minor ways. It includes the extension of the
‘&’ operator, used as, for example,

>16 long&0x7fffffff >0 not stripped

SECURITY
On systems where libseccomp (https://github.com/seccomp/libseccomp) is
available, file restricts itself to the system calls necessary for the
operation of the program. This enforcement does not provide any security
benefit when file is asked to decompress input files by running external
programs with the -z option. To enable execution of external
decompressors, one needs to disable sandboxing using the -S option.

MAGIC DIRECTORY
The magic file entries have been collected from various sources, mainly
USENET, and contributed by various authors. Christos Zoulas (address be‐
low) will collect additional or corrected magic file entries. A consoli‐
dation of magic file entries will be distributed periodically.

The order of entries in the magic file is significant. Depending on what
system you are using, the order that they are put together may be incor‐
rect. If your old file command uses a magic file, keep the old magic
file around for comparison purposes (rename it to
/usr/share/misc/magic.orig).

HISTORY
There has been a file command in every UNIX since at least Research
Version 4 (man page dated November, 1973). The System V version
introduced one significant change: the external list of magic types.
This slowed the program down slightly but made it a lot more flexible.

This program, based on the System V version, was written by Ian Darwin
⟨ian@darwinsys.com⟩ without looking at anybody else's source code.

John Gilmore revised the code extensively, making it better than the
first version. Geoff Collyer found several inadequacies and provided
some magic file entries. The ‘&’ operator was contributed by Rob
McMahon, ⟨cudcv@warwick.ac.uk⟩, 1989.

Guy Harris, ⟨guy@netapp.com⟩, made many changes from 1993 to the present.

Primary development and maintenance from 1990 to the present by Christos
Zoulas ⟨christos@astron.com⟩.

Altered by Chris Lowth ⟨chris@lowth.com⟩, 2000: handle the -i option to
output mime type strings, using an alternative magic file and internal
logic.

Altered by Eric Fischer ⟨enf@pobox.com⟩, July, 2000, to identify charac‐
ter codes and attempt to identify the languages of non-ASCII files.

Altered by Reuben Thomas ⟨rrt@sc3d.org⟩, 2007-2011, to improve MIME sup‐
port, merge MIME and non-MIME magic, support directories as well as files
of magic, apply many bug fixes, update and fix a lot of magic, improve
the build system, improve the documentation, and rewrite the Python bind‐
ings in pure Python.

The list of contributors to the ‘magic’ directory (magic files) is too
long to include here. You know who you are; thank you. Many contribu‐
tors are listed in the source files.

LEGAL NOTICE
Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999. Covered by the
standard Berkeley Software Distribution copyright; see the file COPYING
in the source distribution.

The files tar.h and is_tar.c were written by John Gilmore from his pub‐
lic-domain tar(1) program, and are not covered by the above license.

BUGS
Please report bugs and send patches to the bug tracker at
https://bugs.astron.com/ or the mailing list at ⟨file@astron.com⟩ (visit
https://mailman.astron.com/mailman/listinfo/file first to subscribe).

TODO
Fix output so that tests for MIME and APPLE flags are not needed all over
the place, and actual output is only done in one place. This needs a de‐
sign. Suggestion: push possible outputs on to a list, then pick the
last-pushed (most specific, one hopes) value at the end, or use a default
if the list is empty. This should not slow down evaluation.

The handling of MAGIC_CONTINUE and printing \012- between entries is
clumsy and complicated; refactor and centralize.

Some of the encoding logic is hard-coded in encoding.c and can be moved
to the magic files if we had a !:charset annotation.

Continue to squash all magic bugs. See Debian BTS for a good source.

Store arbitrarily long strings, for example for %s patterns, so that they
can be printed out. Fixes Debian bug #271672. This can be done by allo‐
cating strings in a string pool, storing the string pool at the end of
the magic file and converting all the string pointers to relative offsets
from the string pool.

Add syntax for relative offsets after current level (Debian bug #466037).

Make file -ki work, i.e. give multiple MIME types.

Add a zip library so we can peek inside Office2007 documents to print
more details about their contents.

Add an option to print URLs for the sources of the file descriptions.

Combine script searches and add a way to map executable names to MIME
types (e.g. have a magic value for !:mime which causes the resulting
string to be looked up in a table). This would avoid adding the same
magic repeatedly for each new hash-bang interpreter.

When a file descriptor is available, we can skip and adjust the buffer
instead of the hacky buffer management we do now.

Fix “name” and “use” to check for consistency at compile time (duplicate
“name”, “use” pointing to undefined “name”). Make “name” / “use” more
efficient by keeping a sorted list of names. Special-case ^ to flip
endianness in the parser so that it does not have to be escaped, and
document it.

If the offsets specified internally in the file exceed the buffer size
(HOWMANY variable in file.h), then we don't seek to that offset, but we
give up. It would be better if buffer management were done when the file
descriptor is available so we can seek around the file. One must be
careful though because this has performance and thus security
considerations, because one can slow down things by repeatedly seeking.

There is support now for keeping separate buffers and having offsets from
the end of the file, but the internal buffer management still needs an
overhaul.

AVAILABILITY
You can obtain the original author's latest version by anonymous FTP on
ftp.astron.com in the directory /pub/file/file-X.YZ.tar.gz.

BSD April 7, 2024 BSD

MAGIC(4)                 BSD Kernel Interfaces Manual                MAGIC(4)

NAME
     magic - file command's magic pattern file

DESCRIPTION
     This manual page documents the format of magic files as used by the
     file(1) command, version 5.46.  The file(1) command identifies the type
     of a file using, among other tests, a test for whether the file contains
     certain “magic patterns”.  The database of these “magic patterns” is
     usually located in a binary file in /usr/share/misc/magic.mgc or in a
     directory of source text magic pattern fragment files in
     /usr/share/misc/magic.  The database specifies what patterns are to be
     tested for, what message or MIME type to print if a particular pattern
     is found, and additional information to extract from the file.

     The format of the source fragment files that are used to build this
     database is as follows: each line of a fragment file specifies a test
     to be performed.  A test compares the data starting at a particular
     offset in the file with a byte value, a string or a numeric value.  If
     the test succeeds, a message is printed.  The line consists of the
     following fields:

     offset   A number specifying the offset (in bytes) into the file of the
              data which is to be tested.  This offset can be a negative
              number if it is:
              •   The first direct offset of the magic entry (at
                  continuation level 0), in which case it is interpreted as
                  an offset from the end of the file going backwards.  This
                  works only when a file descriptor to the file is available
                  and it is a regular file.
              •   A continuation offset relative to the end of the last
                  up-level field (&).
              If the offset starts with the symbol “+”, then all offsets are
              interpreted as from the beginning of the file (default).

     type     The type of the data to be tested.  The possible values are:

              byte        A one-byte value.

              short       A two-byte value in this machine's native byte
                          order.

              long        A four-byte value in this machine's native byte
                          order.

              quad        An eight-byte value in this machine's native byte
                          order.

              float       A 32-bit single precision IEEE floating point
                          number in this machine's native byte order.

              double      A 64-bit double precision IEEE floating point
                          number in this machine's native byte order.

              string      A string of bytes.  The string type specification
                          can be optionally followed by a /<width> option
                          and optionally followed by a set of flags
                          /[bCcftTtWw]*.  The width limits the number of
                          characters to be copied.  Zero means all
                          characters.  The following flags are supported:
                              b  Force binary file test.
                              C  Use upper case insensitive matching: upper
                                 case characters in the magic match both
                                 lower and upper case characters in the
                                 target, whereas lower case characters in
                                 the magic only match upper case characters
                                 in the target.
                              c  Use lower case insensitive matching: lower
                                 case characters in the magic match both
                                 lower and upper case characters in the
                                 target, whereas upper case characters in
                                 the magic only match upper case characters
                                 in the target.  To do a complete case
                                 insensitive match, specify both “c” and
                                 “C”.
                              f  Require that the matched string is a full
                                 word, not a partial word match.
                              T  Trim the string, i.e. delete leading and
                                 trailing whitespace.
                              t  Force text file test.
                              W  Compact whitespace in the target, which
                                 must contain at least one whitespace
                                 character.  If the magic has n consecutive
                                 blanks, the target needs at least n
                                 consecutive blanks to match.
                              w  Treat every blank in the magic as an
                                 optional blank.  Blanks are deleted before
                                 the string is printed.

              pstring     A Pascal-style string where the first
                          byte/short/int is interpreted as the unsigned
                          length.  The length defaults to byte and can be
                          specified as a modifier.  The following modifiers
                          are supported:
                              B  A byte length (default).
                              H  A 2 byte big endian length.
                              h  A 2 byte little endian length.
                              L  A 4 byte big endian length.
                              l  A 4 byte little endian length.
                              J  The length includes itself in its count.
                          The string is not NUL terminated.  “J” is used
                          rather than the more valuable “I” because this
                          type of length is a feature of the JPEG format.
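
The pstring length modifiers can be illustrated with a short Python sketch. It follows only the description above (length prefix width and byte order, with J meaning that the stored length also counts the prefix bytes); read_pstring is a hypothetical helper, not a function of libmagic.

```python
import struct

# struct formats for each length modifier described above.
_LENGTH_FORMATS = {
    "B": "B",   # 1-byte length (default)
    "H": ">H",  # 2 bytes, big endian
    "h": "<H",  # 2 bytes, little endian
    "L": ">I",  # 4 bytes, big endian
    "l": "<I",  # 4 bytes, little endian
}


def read_pstring(data: bytes, offset: int = 0, modifiers: str = "B") -> bytes:
    """Read a length-prefixed string from data at the given offset."""
    width_flag = next((m for m in modifiers if m in _LENGTH_FORMATS), "B")
    fmt = _LENGTH_FORMATS[width_flag]
    (length,) = struct.unpack_from(fmt, data, offset)
    header = struct.calcsize(fmt)
    if "J" in modifiers:
        length -= header  # the stored length counts its own bytes
    start = offset + header
    if length < 0 or start + length > len(data):
        raise ValueError("length prefix does not fit the data")
    return data[start:start + length]
```

For instance, read_pstring(b"\x05\x00abcXYZ", modifiers="hJ") reads a little-endian 2-byte length of 5, subtracts the 2 prefix bytes, and returns the 3 bytes that follow.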

              date        A four-byte value interpreted as a UNIX date.

              qdate       An eight-byte value interpreted as a UNIX date.

              ldate       A four-byte value interpreted as a UNIX-style
                          date, but interpreted as local time rather than
                          UTC.

              qldate      An eight-byte value interpreted as a UNIX-style
                          date, but interpreted as local time rather than
                          UTC.

              qwdate      An eight-byte value interpreted as a Windows-style
                          date.

              msdosdate   A two-byte value interpreted as a FAT/DOS-style
                          date.

              msdostime   A two-byte value interpreted as a FAT/DOS-style
                          time.
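
The date encodings named above can be decoded with a few lines of Python. The sketch assumes the usual definitions: FAT/DOS dates pack year-since-1980, month and day into 7, 4 and 5 bits; FAT/DOS times pack hour, minute and two-second units into 5, 6 and 5 bits; and a "Windows-style date" is taken to be a FILETIME, i.e. a count of 100-nanosecond ticks since 1601-01-01 UTC. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

WINDOWS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def decode_date(value: int) -> datetime:
    """UNIX date: seconds since 1970-01-01 UTC."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


def decode_qwdate(value: int) -> datetime:
    """Windows-style date, assumed to be a FILETIME tick count."""
    # Ten 100-nanosecond ticks make one microsecond.
    return WINDOWS_EPOCH + timedelta(microseconds=value // 10)


def decode_msdosdate(value: int) -> Tuple[int, int, int]:
    """FAT/DOS date as (year, month, day)."""
    return 1980 + ((value >> 9) & 0x7F), (value >> 5) & 0x0F, value & 0x1F


def decode_msdostime(value: int) -> Tuple[int, int, int]:
    """FAT/DOS time as (hour, minute, second); seconds have 2 s resolution."""
    return (value >> 11) & 0x1F, (value >> 5) & 0x3F, (value & 0x1F) * 2
```

The local-time variants (ldate, qldate) would use the same arithmetic but present the result in the local time zone instead of UTC.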

              beid3       A 32-bit ID3 length in big-endian byte order.

              beshort     A two-byte value in big-endian byte order.

              belong      A four-byte value in big-endian byte order.

              bequad      An eight-byte value in big-endian byte order.

              befloat     A 32-bit single precision IEEE floating point
                          number in big-endian byte order.

              bedouble    A 64-bit double precision IEEE floating point
                          number in big-endian byte order.

              bedate      A four-byte value in big-endian byte order,
                          interpreted as a UNIX date.

              beqdate     An eight-byte value in big-endian byte order,
                          interpreted as a UNIX date.

              beldate     A four-byte value in big-endian byte order,
                          interpreted as a UNIX-style date, but interpreted
                          as local time rather than UTC.

              beqldate    An eight-byte value in big-endian byte order,
                          interpreted as a UNIX-style date, but interpreted
                          as local time rather than UTC.

              beqwdate    An eight-byte value in big-endian byte order,
                          interpreted as a Windows-style date.

              bemsdosdate
                          A two-byte value in big-endian byte order,
                          interpreted as a FAT/DOS-style date.

              bemsdostime
                          A two-byte value in big-endian byte order,
                          interpreted as a FAT/DOS-style time.

              bestring16  A two-byte unicode (UCS16) string in big-endian
                          byte order.

              leid3       A 32-bit ID3 length in little-endian byte order.

              leshort     A two-byte value in little-endian byte order.

              lelong      A four-byte value in little-endian byte order.

              lequad      An eight-byte value in little-endian byte order.

              lefloat     A 32-bit single precision IEEE floating point
                          number in little-endian byte order.

              ledouble    A 64-bit double precision IEEE floating point
                          number in little-endian byte order.

              ledate      A four-byte value in little-endian byte order,
                          interpreted as a UNIX date.

              leqdate     An eight-byte value in little-endian byte order,
                          interpreted as a UNIX date.

              leldate     A four-byte value in little-endian byte order,
                          interpreted as a UNIX-style date, but interpreted
                          as local time rather than UTC.

              leqldate    An eight-byte value in little-endian byte order,
                          interpreted as a UNIX-style date, but interpreted
                          as local time rather than UTC.

              leqwdate    An eight-byte value in little-endian byte order,
                          interpreted as a Windows-style date.

              lemsdosdate
                          A two-byte value in little-endian byte order,
                          interpreted as a FAT/DOS-style date.

              lemsdostime
                          A two-byte value in little-endian byte order,
                          interpreted as a FAT/DOS-style time.

              lestring16  A two-byte unicode (UCS16) string in little-endian
                          byte order.

              melong      A four-byte value in middle-endian (PDP-11) byte
                          order.

              medate      A four-byte value in middle-endian (PDP-11) byte
                          order, interpreted as a UNIX date.

              meldate     A four-byte value in middle-endian (PDP-11) byte
                          order, interpreted as a UNIX-style date, but
                          interpreted as local time rather than UTC.
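
The three byte orders used by the be*, le* and me* types can be compared with a small Python sketch. It assumes that middle-endian (PDP-11) order means two 16-bit little-endian words with the more significant word stored first; the read_* helper names are hypothetical and values are treated as unsigned for simplicity.

```python
import struct


def read_belong(data: bytes, offset: int = 0) -> int:
    """Four-byte unsigned value, big-endian."""
    return struct.unpack_from(">I", data, offset)[0]


def read_lelong(data: bytes, offset: int = 0) -> int:
    """Four-byte unsigned value, little-endian."""
    return struct.unpack_from("<I", data, offset)[0]


def read_melong(data: bytes, offset: int = 0) -> int:
    """Four-byte unsigned value, middle-endian (assumed PDP-11 layout)."""
    # Each 16-bit word is little-endian; the high word comes first.
    high, low = struct.unpack_from("<HH", data, offset)
    return (high << 16) | low
```

The value 0x0A0B0C0D is stored as 0A 0B 0C 0D in big-endian order, 0D 0C 0B 0A in little-endian order, and 0B 0A 0D 0C in this middle-endian layout.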

              indirect    Starting at the given offset, consult the magic
                          database again.  The offset of the indirect magic
                          is by default absolute in the file, but one can
                          specify /r to indicate that the offset is relative
                          to the beginning of the entry.

              name        Define a “named” magic instance that can be called
                          from another use magic entry, like a subroutine
                          call.  Direct offsets of named magic instances are
                          relative to the offset of the previous matched
                          entry, but indirect offsets are relative to the
                          beginning of the file as usual.  Named magic
                          entries always match.

              use         Recursively call the named magic starting from the
                          current offset.  If the name of the reference
                          begins with a ^, then the endianness of the magic
                          is switched; if the magic mentions leshort, for
                          example, it is treated as beshort and vice versa.
                          This is useful to avoid duplicating the rules for
                          different endianness.

              regex       A regular expression match in extended POSIX
                          regular expression syntax (like egrep).  Regular
                          expressions can take exponential time to process,
                          and their performance is hard to predict, so their
                          use is discouraged.  When used in production
                          environments, their performance should be
                          carefully checked.  The size of the string to
                          search should also be limited by specifying
                          /<length>, to avoid performance issues when
                          scanning long files.  The type specification can
                          also be optionally followed by /[c][s][l].  The
                          “c” flag makes the match case insensitive, while
                          the “s” flag updates the offset to the start
                          offset of the match, rather than the end.  The “l”
                          modifier changes the limit of length to mean
                          number of lines instead of a byte count.  Lines
                          are delimited by the platform's native line
                          delimiter.  When a line count is specified, an
                          implicit byte count is also computed assuming each
                          line is 80 characters long.  If neither a byte nor
                          a line count is specified, the search is limited
                          automatically to 8KiB.  ^ and $ match the
                          beginning and end of individual lines,
                          respectively, not the beginning and end of the
                          file.

              search      A literal string search starting at the given
                          offset.  The same modifier flags can be used as
                          for string patterns.  The search expression must
                          contain the range in the form /number, that is the
                          number of positions at which the match will be
                          attempted, starting from the start offset.  This
                          is suitable for searching larger binary
                          expressions with variable offsets, using \ escapes
                          for special characters.  The order of modifier and
                          number is not relevant.
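
The role of the /number range in a search test can be sketched in Python as a bounded scan. The sketch reads the range as the count of start positions tried, as described above, and ignores the string modifier flags; search is a hypothetical helper name.

```python
def search(data: bytes, pattern: bytes, offset: int, positions: int) -> int:
    """Return the offset of the first match, or -1 if none is found.

    The match is attempted at `positions` consecutive start offsets,
    beginning at `offset`.
    """
    for start in range(offset, offset + positions):
        if data[start:start + len(pattern)] == pattern:
            return start
    return -1
```

With data b"xxxxPK", a range of 4 tries offsets 0 to 3 and misses the pattern b"PK" at offset 4, while a range of 5 finds it.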

              default     This is intended to be used with the test x (which
                          is always true) and it has no type.  It matches
                          when no other test at that continuation level has
                          matched before.  The matched tests for a
                          continuation level can be cleared using the clear
                          test.

              clear       This test is always true and clears the match flag
                          for that continuation level.  It is intended to be
                          used with the default test.

              der         Parse the file as a DER Certificate file.  The
                          test field is used as a der type that needs to be
                          matched.  The DER types are: eoc, bool, int,
                          bit_str, octet_str, null, obj_id, obj_desc, ext,
                          real, enum, embed, utf8_str, rel_oid, time, res2,
                          seq, set, num_str, prt_str, t61_str, vid_str,
                          ia5_str, utc_time, gen_time, gr_str, vis_str,
                          gen_str, univ_str, char_str, bmp_str, date, tod,
                          datetime, duration, oid-iri, rel-oid-iri.  These
                          types can be followed by an optional numeric size,
                          which indicates the field width in bytes.

              guid        A Globally Unique Identifier, parsed and printed
                          as XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.  Its
                          format is a string.

              offset      This is a quad value indicating the current offset
                          of the file.  It can be used to determine the size
                          of the file or the magic buffer.  For example, the
                          magic entries:

				-0	offset	x	this file is %lld bytes
				-0	offset	<=100	must be more than 100 \
				    bytes and is only %lld

              octal       A string representing an octal number.

              For compatibility with the Single UNIX Standard, the type
              specifiers dC and d1 are equivalent to byte, the type
              specifiers uC and u1 are equivalent to ubyte, the type
              specifiers dS and d2 are equivalent to short, the type
              specifiers uS and u2 are equivalent to ushort, the type
              specifiers dI, dL, and d4 are equivalent to long, the type
              specifiers uI, uL, and u4 are equivalent to ulong, the type
              specifier d8 is equivalent to quad, the type specifier u8 is
              equivalent to uquad, and the type specifier s is equivalent to
              string.  In addition, the type specifier dQ is equivalent to
              quad and the type specifier uQ is equivalent to uquad.

              Each top-level magic pattern (see below for an explanation of
              levels) is classified as text or binary according to the types
              used.  Types “regex” and “search” are classified as text
              tests, unless non-printable characters are used in the
              pattern.  All other tests are classified as binary.  A
              top-level pattern is considered to be a text test when all its
              patterns are text patterns; otherwise, it is considered to be
              a binary pattern.  When matching a file, binary patterns are
              tried first; if no match is found, and the file looks like
              text, then its encoding is determined and the text patterns
              are tried.

              The numeric types may optionally be followed by & and a
              numeric value, to specify that the value is to be AND'ed with
              the numeric value before any comparisons are done.  Prepending
              a u to the type indicates that ordered comparisons should be
              unsigned.
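
The effect of the & mask and of the u prefix can be shown with a four-byte value in Python. This is a sketch of the idea only: the value is read, AND'ed with the mask, and then viewed either as signed or as unsigned before any ordered comparison. masked_long is a hypothetical helper and the order in which file itself applies masking and sign handling is not taken from its source.

```python
import struct


def masked_long(data: bytes, offset: int, mask: int, unsigned: bool,
                little_endian: bool = True) -> int:
    """Read a 4-byte value, apply the mask, and pick the signedness."""
    order = "<" if little_endian else ">"
    (raw,) = struct.unpack_from(order + "I", data, offset)
    value = raw & mask
    if not unsigned and value & 0x80000000:
        value -= 1 << 32  # reinterpret the top bit as a sign bit
    return value
```

With all 32 bits set, the signed view is -1 and fails a ">0" test, the unsigned view is 4294967295 and passes it, and a mask of 0x7fffffff makes both views agree.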

     test     The value to be compared with the value from the file.  If the
              type is numeric, this value is specified in C form; if it is a
              string, it is specified as a C string with the usual escapes
              permitted (e.g. \n for new-line).

              Numeric values may be preceded by a character indicating the
              operation to be performed.  It may be =, to specify that the
              value from the file must equal the specified value, <, to
              specify that the value from the file must be less than the
              specified value, >, to specify that the value from the file
              must be greater than the specified value, &, to specify that
              the value from the file must have set all of the bits that are
              set in the specified value, ^, to specify that the value from
              the file must have clear any of the bits that are set in the
              specified value, or ~, to specify that the value given after
              it is negated before the test.  x specifies that any value
              will match.  If the character is omitted, it is assumed to be
              =.  The operators &, ^, and ~ do not work with floats and
              doubles.  The operator ! specifies that the line matches if
              the test does not succeed.
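
The numeric operators can be summarised as a small Python function. It is a sketch of the rules as stated above, with two labeled assumptions: the ^ operator is read as "at least one of the specified bits is clear", and ! is treated as a prefix that negates the test that follows it (= when nothing follows). The ~ operator is left out because the text does not pin down the operand width. numeric_test is a hypothetical name.

```python
def numeric_test(file_value: int, operator: str, test_value: int) -> bool:
    """Evaluate one numeric magic test against a value read from the file."""
    if operator.startswith("!"):
        # Negation: the line matches if the remaining test does not succeed.
        return not numeric_test(file_value, operator[1:] or "=", test_value)
    if operator == "x":
        return True  # any value matches
    if operator in ("", "="):
        return file_value == test_value
    if operator == "<":
        return file_value < test_value
    if operator == ">":
        return file_value > test_value
    if operator == "&":
        # All bits set in the test value must be set in the file value.
        return (file_value & test_value) == test_value
    if operator == "^":
        # Assumption: at least one bit of the test value must be clear.
        return (file_value & test_value) != test_value
    raise ValueError("unsupported operator: " + operator)
```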

              Numeric values are specified in C form; e.g. 13 is decimal,
              013 is octal, and 0x13 is hexadecimal.

              Numeric operations are not performed on date types; instead
              the numeric value is interpreted as an offset.

              For string values, the string from the file must match the
              specified string.  The operators =, < and > (but not &) can be
              applied to strings.  The length used for matching is that of
              the string argument in the magic file.  This means that a line
              can match any non-empty string (usually used to then print the
              string) by using >\0, because all non-empty strings are
              greater than the empty string.

              Dates are treated as numerical values in the respective
              internal representation.

              The special test x always evaluates to true.

     message  The message to be printed if the comparison succeeds.  If the
              string contains a printf(3) format specification, the value
              from the file (with any specified masking performed) is
              printed using the message as the format string.  If the string
              begins with “\b”, the message printed is the remainder of the
              string with no whitespace added before it: multiple matches
              are normally separated by a single space.
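
How the messages of successive matches combine can be sketched in Python. The sketch covers only the spacing rule described above (a single space between messages unless a message starts with the two characters backslash and b); it additionally assumes that empty messages contribute nothing, and it does not perform printf(3) formatting. join_messages is a hypothetical name.

```python
from typing import Iterable


def join_messages(messages: Iterable[str]) -> str:
    """Concatenate the messages of matched lines into one description."""
    output = ""
    for message in messages:
        if not message:
            continue  # assumption: an empty message prints nothing
        if message.startswith("\\b"):
            output += message[2:]  # no separating space is added
        elif output:
            output += " " + message
        else:
            output += message
    return output
```

For example, the messages "ELF", "32-bit" and "\b, stripped" combine into "ELF 32-bit, stripped".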

     An APPLE 4+4 character APPLE creator and type can be specified as:

           !:apple CREATYPE

     A slash-separated list of commonly found filename extensions can be
     specified as:

           !:ext   ext[/ext...]

     i.e. the literal string “!:ext” followed by a slash-separated list of
     commonly found extensions; for example, for JPEG images:

           !:ext jpeg/jpg/jpe/jfif

     A MIME type is given on a separate line, which must be the next
     non-blank or comment line after the magic line that identifies the file
     type, and has the following format:

           !:mime  MIMETYPE

     i.e. the literal string “!:mime” followed by the MIME type.

     An optional strength can be supplied on a separate line which refers to
     the current magic description using the following format:

           !:strength OP VALUE

     The operand OP can be: +, -, *, or / and VALUE is a constant between 0
     and 255.  This constant is applied using the specified operand to the
     currently computed default magic strength.
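
The arithmetic behind a !:strength line is simple enough to sketch in Python. The sketch assumes integer arithmetic (so / truncates) and applies no clamping to the result, since the text above does not say how out-of-range results are handled; adjust_strength is a hypothetical name and the default strength of 70 in the usage note is an arbitrary example value.

```python
def adjust_strength(default: int, op: str, value: int) -> int:
    """Apply a '!:strength OP VALUE' adjustment to a default strength."""
    if not 0 <= value <= 255:
        raise ValueError("VALUE must be a constant between 0 and 255")
    if op == "+":
        return default + value
    if op == "-":
        return default - value
    if op == "*":
        return default * value
    if op == "/":
        if value == 0:
            raise ValueError("division by zero")
        return default // value  # assumption: integer division
    raise ValueError("OP must be one of +, -, *, /")
```

With a default strength of 70, "!:strength + 10" would give 80 and "!:strength / 2" would give 35.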

     Some file formats contain additional information which is to be printed
     along with the file type, or need additional tests to determine the true
     file type.  These additional tests are introduced by one or more >
     characters preceding the offset.  The number of > on the line indicates
     the level of the test; a line with no > at the beginning is considered
     to be at level 0.  Tests are arranged in a tree-like hierarchy: if the
     test on a line at level n succeeds, all following tests at level n+1 are
     performed, and the messages are printed if the tests succeed, until a
     line with level n (or less) appears.  For more complex files, one can
     use empty messages to get just the “if/then” effect, in the following
     way:

	   0	  string    MZ
	   >0x18  uleshort  <0x40   MS-DOS executable
	   >0x18  uleshort  >0x3f   extended PC executable (e.g., MS Windows)

     Offsets do not need to be constant, but can also be read from the file
     being examined.  If the first character following the last > is a (, then
     the string after the parenthesis is interpreted as an indirect offset.
     That means that the number after the parenthesis is used as an offset in
     the file.  The value at that offset is read, and is used again as an
     offset in the file.  Indirect offsets are of the form:
     (x [[.,][bBcCeEfFgGhHiIlLmosSqQ]][+-][ y ]).  The value of x is used as an
     offset in the file.  A byte, id3 length, short or long is read at that
     offset depending on the [bBcCeEfFgGhHiIlLmosSqQ] type specifier.  The
     value is treated as signed if “,” is specified or unsigned if “.” is
     specified.  The capitalized types interpret the number as a big endian
     value, whereas the small letter versions interpret the number as a little
     endian value; the m type interprets the number as a middle endian
     (PDP-11) value.  To that number the value of y is added and the result is
     used as an offset in the file.  The default type if one is not specified
     is long.  The following types are recognized:

	   Type	   Mnemonic	 Endian	   Size
	   bcBC	   Byte/Char	 N/A	   1
	   efg	   Double	 Little	   8
	   EFG	   Double	 Big	   8
	   hs	   Half/Short	 Little	   2
	   HS	   Half/Short	 Big	   2
	   i	   ID3		 Little	   4
	   I	   ID3		 Big	   4
	   l	   Long		 Little	   4
	   L	   Long		 Big	   4
	   m	   Middle	 Middle	   4
	   o	   Octal	 Textual   Variable
	   q	   Quad		 Little	   8
	   Q	   Quad		 Big	   8

     That way variable length structures can be examined:

	   # MS Windows executables are also valid MS-DOS executables
	   0	       string	MZ
	   >0x18       uleshort <0x40  MZ executable (MS-DOS)
	   # skip the whole block below if it is not an extended executable
	   >0x18       uleshort >0x3f
	   >>(0x3c.l)  string	PE\0\0 PE executable (MS-Windows)
	   >>(0x3c.l)  string	LX\0\0 LX executable (OS/2)

     This strategy of examining has a drawback: you must make sure that you
     eventually print something, or users may get empty output (such as when
     there is neither PE\0\0 nor LX\0\0 in the above example).

     If this indirect offset cannot be used directly, simple calculations are
     possible: appending [+-*/%&|^]number inside parentheses allows one to
     modify the value read from the file before it is used as an offset:

	   # MS Windows executables are also valid MS-DOS executables
	   0	       string	MZ
	   # sometimes, the value at 0x18 is less than 0x40 but there's still an
	   # extended executable, simply appended to the file
	   >0x18       uleshort <0x40
	   >>(4.s*512) leshort	0x014c	COFF executable (MS-DOS, DJGPP)
	   >>(4.s*512) leshort	!0x014c MZ executable (MS-DOS)

     Sometimes you do not know the exact offset, as this depends on the length
     or position (when indirection was used before) of preceding fields.  You
     can specify an offset relative to the end of the last up-level field
     using ‘&’ as a prefix to the offset:

	   0	       string	MZ
	   >0x18       uleshort >0x3f
	   >>(0x3c.l)  string	PE\0\0	  PE executable (MS-Windows)
	   # immediately following the PE signature is the CPU type
	   >>>&0       leshort	0x14c	  for Intel 80386
	   >>>&0       leshort	0x8664	  for x86-64
	   >>>&0       leshort	0x184	  for DEC Alpha

     Indirect and relative offsets can be combined:

	   0		 string	  MZ
	   >0x18	 uleshort <0x40
	   >>(4.s*512)	 leshort  !0x014c MZ executable (MS-DOS)
	   # if it's not COFF, go back 512 bytes and add the offset taken
	   # from byte 2/3, which is yet another way of finding the start
	   # of the extended executable
	   >>>&(2.s-514) string	  LE	  LE executable (MS Windows VxD driver)

     Or the other way around:

	   0		     string   MZ
	   >0x18	     uleshort >0x3f
	   >>(0x3c.l)	     string   LE\0\0  LE executable (MS-Windows)
	   # at offset 0x80 (-4, since relative offsets start at the end
	   # of the up-level match) inside the LE header, we find the absolute
	   # offset to the code area, where we look for a specific signature
	   >>>(&0x7c.l+0x26) string   UPX     \b, UPX compressed

     Or even both!

	   0		    string   MZ
	   >0x18	    uleshort >0x3f
	   >>(0x3c.l)	    string   LE\0\0 LE executable (MS-Windows)
	   # at offset 0x58 inside the LE header, we find the relative offset
	   # to a data area where we look for a specific signature
	   >>>&(&0x54.l-3)  string   UNACE  \b, ACE self-extracting archive

     If you have to deal with offset/length pairs in your file, even the
     second value in a parenthesized expression can be taken from the file
     itself, using another set of parentheses.  Note that this additional
     indirect offset is always relative to the start of the main indirect
     offset.

	   0		     string	  MZ
	   >0x18	     uleshort	  >0x3f
	   >>(0x3c.l)	     string	  PE\0\0 PE executable (MS-Windows)
	   # search for the PE section called ".idata"...
	   >>>&0xf4	     search/0x140 .idata
	   # ...and go to the end of it, calculated from start+length;
	   # these are located 14 and 10 bytes after the section name
	   >>>>(&0xe.l+(-4)) string	  PK\3\4 \b, ZIP self-extracting archive

     If you have a list of known values at a particular continuation level,
     and you want to provide a switch-like default case:

	   # clear that continuation level match
	   >18	   clear   x
	   >18	   lelong  1	   one
	   >18	   lelong  2	   two
	   >18	   default x
	   # print default match
	   >>18	   lelong  x	   unmatched 0x%x

SEE ALSO
     file(1) - the command that reads this file.

BUGS
     The formats long, belong, lelong, melong, short, beshort, and leshort do
     not depend on the length of the C data types short and long on the
     platform, even though the Single UNIX Specification implies that they do.
     However, as OS X Mountain Lion has passed the Single UNIX Specification
     validation suite, and supplies a version of file(1) in which they do not
     depend on the sizes of the C data types and that is built for a 64-bit
     environment in which long is 8 bytes rather than 4 bytes, presumably the
     validation suite does not test whether, for example, long refers to an
     item with the same size as the C data type long.  There should probably
     be type names int8, uint8, int16, uint16, int32, uint32, int64, and
     uint64, and specified-byte-order variants of them, to make it clearer
     that those types have specified widths.

BSD			       November 27, 2024			   BSD

MAGIC(4)		 BSD Kernel Interfaces Manual		      MAGIC(4)

NAME
     magic — file command's magic pattern file

DESCRIPTION
     This manual page documents the format of magic files as used by the
     file(1) command, version 5.46.  The file(1) command identifies the type
     of a file using, among other tests, a test for whether the file contains
     certain “magic patterns”.	The database of these “magic patterns” is usu‐
     ally located in a binary file in /usr/share/misc/magic.mgc or a directory
     of source text magic pattern fragment files in /usr/share/misc/magic.
     The database specifies what patterns are to be tested for, what message
     or MIME type to print if a particular pattern is found, and additional
     information to extract from the file.

     The format of the source fragment files that are used to build this data‐
     base is as follows: Each line of a fragment file specifies a test to be
     performed.	 A test compares the data starting at a particular offset in
     the file with a byte value, a string or a numeric value.  If the test
     succeeds, a message is printed.  The line consists of the following
     fields:

     offset   A number specifying the offset (in bytes) into the file of the
	      data which is to be tested.  This offset can be a negative num‐
	      ber if it is:
	      •	  The first direct offset of the magic entry (at continuation
		  level 0), in which case it is interpreted as an offset from
		  the end of the file going backwards.  This works only when a
		  file descriptor to the file is available and it is a regular
		  file.
	      •	  A continuation offset relative to the end of the last up-
		  level field (&).
	      If the offset starts with the symbol “+”, then all offsets are
	      interpreted as from the beginning of the file (the default).

     type     The type of the data to be tested.  The possible values are:

	      byte	  A one-byte value.

	      short	  A two-byte value in this machine's native byte or‐
			  der.

	      long	  A four-byte value in this machine's native byte or‐
			  der.

	      quad	  An eight-byte value in this machine's native byte
			  order.

	      float	  A 32-bit single precision IEEE floating point number
			  in this machine's native byte order.

	      double	  A 64-bit double precision IEEE floating point number
			  in this machine's native byte order.

	      string	  A string of bytes.  The string type specification
			  can be optionally followed by a /<width> option and
			  optionally followed by a set of flags /[bCcfTtWw]*.
			  The width limits the number of characters to be
			  copied.  Zero means all characters.  The following
			  flags are supported:
			      b	 Force binary file test.
			      C	 Use upper case insensitive matching: upper
				 case characters in the magic match both lower
				 and upper case characters in the target,
				 whereas lower case characters in the magic
				 only match lower case characters in the
				 target.
			      c	 Use lower case insensitive matching: lower
				 case characters in the magic match both lower
				 and upper case characters in the target,
				 whereas upper case characters in the magic
				 only match upper case characters in the tar‐
				 get.  To do a complete case insensitive
				 match, specify both “c” and “C”.
			      f	 Require that the matched string is a full
				 word, not a partial word match.
			      T	 Trim the string, i.e. leading and trailing
				 whitespace is deleted before the string is
				 printed.
			      t	 Force text file test.
			      W	 Compact whitespace in the target, which must
				 contain at least one whitespace character.
				 If the magic has n consecutive blanks, the
				 target needs at least n consecutive blanks to
				 match.
			      w	 Treat every blank in the magic as an optional
				 blank.

	      pstring	  A Pascal-style string where the first byte/short/int
			  is interpreted as the unsigned length.  The length
			  defaults to byte and can be specified as a modifier.
			  The following modifiers are supported:
			      B	 A byte length (default).
			      H	 A 2 byte big endian length.
			      h	 A 2 byte little endian length.
			      L	 A 4 byte big endian length.
			      l	 A 4 byte little endian length.
			      J	 The length includes itself in its count.
			  The string is not NUL terminated.  “J” is used
			  rather than the more valuable “I” because this type
			  of length is a feature of the JPEG format.

	      date	  A four-byte value interpreted as a UNIX date.

	      qdate	  An eight-byte value interpreted as a UNIX date.

	      ldate	  A four-byte value interpreted as a UNIX-style date,
			  but interpreted as local time rather than UTC.

	      qldate	  An eight-byte value interpreted as a UNIX-style
			  date, but interpreted as local time rather than UTC.

	      qwdate	  An eight-byte value interpreted as a Windows-style
			  date.

	      msdosdate	  A two-byte value interpreted as FAT/DOS-style date.

	      msdostime	  A two-byte value interpreted as FAT/DOS-style time.

	      beid3	  A 32-bit ID3 length in big-endian byte order.

	      beshort	  A two-byte value in big-endian byte order.

	      belong	  A four-byte value in big-endian byte order.

	      bequad	  An eight-byte value in big-endian byte order.

	      befloat	  A 32-bit single precision IEEE floating point number
			  in big-endian byte order.

	      bedouble	  A 64-bit double precision IEEE floating point number
			  in big-endian byte order.

	      bedate	  A four-byte value in big-endian byte order, inter‐
			  preted as a Unix date.

	      beqdate	  An eight-byte value in big-endian byte order, inter‐
			  preted as a Unix date.

	      beldate	  A four-byte value in big-endian byte order, inter‐
			  preted as a UNIX-style date, but interpreted as lo‐
			  cal time rather than UTC.

	      beqldate	  An eight-byte value in big-endian byte order, inter‐
			  preted as a UNIX-style date, but interpreted as lo‐
			  cal time rather than UTC.

	      beqwdate	  An eight-byte value in big-endian byte order, inter‐
			  preted as a Windows-style date.

	      bemsdosdate
			  A two-byte value in big-endian byte order, inter‐
			  preted as FAT/DOS-style date.

	      bemsdostime
			  A two-byte value in big-endian byte order, inter‐
			  preted as FAT/DOS-style time.

	      bestring16  A two-byte unicode (UCS16) string in big-endian byte
			  order.

	      leid3	  A 32-bit ID3 length in little-endian byte order.

	      leshort	  A two-byte value in little-endian byte order.

	      lelong	  A four-byte value in little-endian byte order.

	      lequad	  An eight-byte value in little-endian byte order.

	      lefloat	  A 32-bit single precision IEEE floating point number
			  in little-endian byte order.

	      ledouble	  A 64-bit double precision IEEE floating point number
			  in little-endian byte order.

	      ledate	  A four-byte value in little-endian byte order, in‐
			  terpreted as a UNIX date.

	      leqdate	  An eight-byte value in little-endian byte order, in‐
			  terpreted as a UNIX date.

	      leldate	  A four-byte value in little-endian byte order, in‐
			  terpreted as a UNIX-style date, but interpreted as
			  local time rather than UTC.

	      leqldate	  An eight-byte value in little-endian byte order, in‐
			  terpreted as a UNIX-style date, but interpreted as
			  local time rather than UTC.

	      leqwdate	  An eight-byte value in little-endian byte order, in‐
			  terpreted as a Windows-style date.

	      lemsdosdate
			  A two-byte value in little-endian byte order,
			  interpreted as FAT/DOS-style date.

	      lemsdostime
			  A two-byte value in little-endian byte order,
			  interpreted as FAT/DOS-style time.

	      lestring16  A two-byte unicode (UCS16) string in little-endian
			  byte order.

	      melong	  A four-byte value in middle-endian (PDP-11) byte or‐
			  der.

	      medate	  A four-byte value in middle-endian (PDP-11) byte or‐
			  der, interpreted as a UNIX date.

	      meldate	  A four-byte value in middle-endian (PDP-11) byte or‐
			  der, interpreted as a UNIX-style date, but inter‐
			  preted as local time rather than UTC.

	      indirect	  Starting at the given offset, consult the magic
			  database again.  The offset of the indirect magic is
			  by default absolute in the file, but one can specify
			  /r to indicate that the offset is relative from the
			  beginning of the entry.

	      name	  Define a “named” magic instance that can be called
			  from another use magic entry, like a subroutine
			  call.	 Named instance direct magic offsets are rela‐
			  tive to the offset of the previous matched entry,
			  but indirect offsets are relative to the beginning
			  of the file as usual.	 Named magic entries always
			  match.

	      use	  Recursively call the named magic starting from the
			  current offset.  If the name of the referenced magic
			  begins with a ^ then the endianness of the magic is
			  switched; if the magic mentioned leshort for
			  example, it is treated as beshort and vice versa.
			  This is useful to avoid duplicating the rules for
			  different endianness.

	      regex	  A regular expression match in extended POSIX regular
			  expression syntax (like egrep).  Regular expressions
			  can take exponential time to process, and their per‐
			  formance is hard to predict, so their use is dis‐
			  couraged.  When used in production environments,
			  their performance should be carefully checked.  The
			  size of the string to search should also be limited
			  by specifying /<length>, to avoid performance issues
			  scanning long files.  The type specification can
			  also be optionally followed by /[c][s][l].  The “c”
			  flag makes the match case insensitive, while the “s”
			  flag updates the offset to the start offset of the
			  match, rather than the end.  The “l” modifier
			  changes the limit of length to mean number of lines
			  instead of a byte count.  Lines are delimited by the
			  platform's native line delimiter.  When a line count
			  is specified, an implicit byte count is also
			  computed assuming each line is 80 characters long.
			  If neither a byte nor a line count is specified, the
			  search is limited automatically to 8KiB.  ^ and $
			  match the beginning and end of individual lines,
			  respectively, not beginning and end of file.

	      search	  A literal string search starting at the given off‐
			  set.	The same modifier flags can be used as for
			  string patterns.  The search expression must contain
			  the range in the form /number, that is the number of
			  positions at which the match will be attempted,
			  starting from the start offset.  This is suitable
			  for searching larger binary expressions with vari‐
			  able offsets, using \ escapes for special charac‐
			  ters.	 The order of modifier and number is not rele‐
			  vant.

	      default	  This is intended to be used with the test x (which
			  is always true) and it has no type.  It matches when
			  no other test at that continuation level has matched
			  before.  The matched tests for a continuation level
			  can be cleared using the clear test.

	      clear	  This test is always true and clears the match flag
			  for that continuation level.	It is intended to be
			  used with the default test.

	      der	  Parse the file as a DER Certificate file.  The test
			  field is used as a der type that needs to be
			  matched.  The DER types are: eoc, bool, int,
			  bit_str, octet_str, null, obj_id, obj_desc, ext,
			  real, enum, embed, utf8_str, rel_oid, time, res2,
			  seq, set, num_str, prt_str, t61_str, vid_str,
			  ia5_str, utc_time, gen_time, gr_str, vis_str,
			  gen_str, univ_str, char_str, bmp_str, date, tod,
			  datetime, duration, oid-iri, rel-oid-iri.  These
			  types can be followed by an optional numeric size,
			  which indicates the field width in bytes.

	      guid	  A Globally Unique Identifier, parsed and printed as
			  XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.  Its format
			  is a string.

	      offset	  This is a quad value indicating the current offset
			  of the file.	It can be used to determine the size
			  of the file or the magic buffer.  For example the
			  magic entries:

				-0	offset	x	this file is %lld bytes
				-0	offset	<=100	must be more than 100 \
				    bytes and is only %lld

	      octal	  A string representing an octal number.

	      For compatibility with the Single UNIX Standard, the type speci‐
	      fiers dC and d1 are equivalent to byte, the type specifiers uC
	      and u1 are equivalent to ubyte, the type specifiers dS and d2
	      are equivalent to short, the type specifiers uS and u2 are
	      equivalent to ushort, the type specifiers dI, dL, and d4 are
	      equivalent to long, the type specifiers uI, uL, and u4 are
	      equivalent to ulong, the type specifier d8 is equivalent to
	      quad, the type specifier u8 is equivalent to uquad, and the type
	      specifier s is equivalent to string.  In addition, the type
	      specifier dQ is equivalent to quad and the type specifier uQ is
	      equivalent to uquad.

	      Each top-level magic pattern (see below for an explanation of
	      levels) is classified as text or binary according to the types
	      used.  Types “regex” and “search” are classified as text tests,
	      unless non-printable characters are used in the pattern.	All
	      other tests are classified as binary.  A top-level pattern is
	      considered to be a text test when all its patterns are text
	      patterns; otherwise, it is considered to be a binary pattern.  When
	      matching a file, binary patterns are tried first; if no match is
	      found, and the file looks like text, then its encoding is deter‐
	      mined and the text patterns are tried.

	      The numeric types may optionally be followed by & and a numeric
	      value, to specify that the value is to be AND'ed with the nu‐
	      meric value before any comparisons are done.  Prepending a u to
	      the type indicates that ordered comparisons should be unsigned.

     test     The value to be compared with the value from the file.  If the
	      type is numeric, this value is specified in C form; if it is a
	      string, it is specified as a C string with the usual escapes
	      permitted (e.g. \n for new-line).

	      Numeric values may be preceded by a character indicating the
	      operation to be performed.  It may be =, to specify that the
	      value from the file must equal the specified value, <, to
	      specify that the value from the file must be less than the
	      specified value, >, to specify that the value from the file
	      must be greater than the specified value, &, to specify that
	      the value from the file must have set all of the bits that are
	      set in the specified value, ^, to specify that the value from
	      the file must have clear at least one of the bits that are set
	      in the specified value, ~, to specify that the value given
	      after it is negated before it is tested, or x, to specify that
	      any value will match.  If the character is omitted, it is
	      assumed to be =.  Operators &, ^, and ~ don't work with floats
	      and doubles.  The operator ! specifies that the line matches if
	      the test does not succeed.

	      Numeric values are specified in C form; e.g.  13 is decimal, 013
	      is octal, and 0x13 is hexadecimal.

	      Numeric operations are not performed on date types; instead, the
	      numeric value is interpreted as an offset.

	      For string values, the string from the file must match the spec‐
	      ified string.  The operators =, < and > (but not &) can be ap‐
	      plied to strings.	 The length used for matching is that of the
	      string argument in the magic file.  This means that a line can
	      match any non-empty string (usually used to then print the
	      string), with >\0 (because all non-empty strings are greater
	      than the empty string).

	      Dates are treated as numerical values in the respective internal
	      representation.

	      The special test x always evaluates to true.

     message  The message to be printed if the comparison succeeds.  If the
	      string contains a printf(3) format specification, the value from
	      the file (with any specified masking performed) is printed using
	      the message as the format string.	 If the string begins with
	      “\b”, the message printed is the remainder of the string with no
	      whitespace added before it: multiple matches are normally sepa‐
	      rated by a single space.
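
     The numeric comparisons described for the test field can be sketched in
     Python as follows.  This is only an illustration of the rules above, not
     the code used by file(1): the function names parse_c_int and numeric_test
     are hypothetical, the file value is assumed to be already read and masked,
     and the ~ operator is left out.

```python
def parse_c_int(text):
    """Parse a number written in C form: 13 decimal, 013 octal, 0x13 hex."""
    text = text.strip().lower()
    if text.startswith("0x"):
        return int(text, 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


def numeric_test(file_value, test):
    """Return True if file_value satisfies a numeric test such as '>0x3f'.

    Hypothetical helper: file_value is assumed to be an int that has
    already been read from the file with any mask applied.
    """
    if test == "x":
        return True                      # x matches any value
    if test.startswith("!"):
        # ! matches when the remaining test does not succeed
        return not numeric_test(file_value, test[1:])
    op, literal = "=", test              # an omitted operator means =
    if test[0] in "=<>&^":
        op, literal = test[0], test[1:]
    wanted = parse_c_int(literal)
    if op == "=":
        return file_value == wanted
    if op == "<":
        return file_value < wanted
    if op == ">":
        return file_value > wanted
    if op == "&":
        # all bits set in the test value must be set in the file value
        return (file_value & wanted) == wanted
    # op == "^": at least one bit set in the test value must be clear
    return (file_value & wanted) != wanted
```

     For instance, numeric_test(0x50, ">0x3f") mirrors the second continuation
     line of the MZ examples later in this page.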

     A 4+4 character APPLE creator and type can be specified as:

	   !:apple CREATYPE

     A slash-separated list of commonly found filename extensions can be spec‐
     ified as:

	   !:ext   ext[/ext...]

     i.e. the literal string “!:ext” followed by a slash-separated list of
     commonly found extensions; for example for JPEG images:

	   !:ext jpeg/jpg/jpe/jfif

     A MIME type is given on a separate line, which must be the next line that
     is neither blank nor a comment after the magic line that identifies the
     file type, and has the following format:

	   !:mime  MIMETYPE

     i.e. the literal string “!:mime” followed by the MIME type.

     An optional strength can be supplied on a separate line which refers to
     the current magic description using the following format:

	   !:strength OP VALUE

     The operator OP can be +, -, *, or /, and VALUE is a constant between 0
     and 255.  This constant is applied, using the specified operator, to the
     currently computed default magic strength.
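
     As a rough sketch of how such a line adjusts a strength, the following
     Python applies the operator to a default value that is assumed to have
     been computed elsewhere.  The name adjust_strength is hypothetical, and
     integer division and the absence of any clamping of the result are
     assumptions of this sketch rather than documented behaviour.

```python
import operator

# The four operators accepted by !:strength.
STRENGTH_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,   # integer division is an assumption
}


def adjust_strength(default_strength, directive):
    """Apply a '!:strength OP VALUE' line to an already computed strength."""
    keyword, rest = directive.split(None, 1)
    if keyword != "!:strength":
        raise ValueError("not a strength directive: %r" % directive)
    rest = rest.strip()
    op, value = rest[0], int(rest[1:])
    if op not in STRENGTH_OPS:
        raise ValueError("unknown operator: %r" % op)
    if not 0 <= value <= 255:
        raise ValueError("VALUE must be between 0 and 255")
    return STRENGTH_OPS[op](default_strength, value)
```

     With a default strength of 70, the directive “!:strength + 10” yields 80
     in this sketch.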

     Some file formats contain additional information which is to be printed
     along with the file type or need additional tests to determine the true
     file type.	 These additional tests are introduced by one or more > char‐
     acters preceding the offset.  The number of > on the line indicates the
     level of the test; a line with no > at the beginning is considered to be
     at level 0.  Tests are arranged in a tree-like hierarchy: if the test on
     a line at level n succeeds, all following tests at level n+1 are per‐
     formed, and the messages printed if the tests succeed, until a line with
     level n (or less) appears.	 For more complex files, one can use empty
     messages to get just the "if/then" effect, in the following way:

	   0	  string    MZ
	   >0x18  uleshort  <0x40   MS-DOS executable
	   >0x18  uleshort  >0x3f   extended PC executable (e.g., MS Windows)
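
     The level rule can be illustrated with a small Python walk over
     already-evaluated tests.  This is a simplified sketch: the function name
     evaluate is hypothetical, and each entry carries a precomputed pass/fail
     flag instead of reading anything from a file.

```python
def evaluate(entries):
    """Return the messages printed for one magic entry.

    entries is a list of (level, passed, message) tuples in file order;
    'passed' stands in for the result of the real comparison.
    """
    printed = []
    active = 0   # deepest level whose tests are currently performed
    for level, passed, message in entries:
        if level > active:
            continue                 # parent test failed: skip the subtree
        if passed:
            if message:              # empty messages give the if/then effect
                printed.append(message)
            active = level + 1       # tests at the next level are now performed
        else:
            active = level           # only siblings and shallower lines remain
    return printed


# The MZ example above, for a file whose value at 0x18 is greater than 0x3f.
mz_entry = [
    (0, True, ""),
    (1, False, "MS-DOS executable"),
    (1, True, "extended PC executable (e.g., MS Windows)"),
]
print(" ".join(evaluate(mz_entry)))
```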

     Offsets do not need to be constant, but can also be read from the file
     being examined.  If the first character following the last > is a ( then
     the string after the parenthesis is interpreted as an indirect offset.
     That means that the number after the parenthesis is used as an offset in
     the file.  The value at that offset is read, and is used again as an
     offset in the file.  Indirect offsets are of the form: (x
     [[.,][bBcCeEfFgGhHiIlLmosSqQ]][+-][ y ]).  The value of x is used as an
     offset in the file.  A byte, id3 length, short or long is read at that
     offset depending on the [bBcCeEfFgGhHiIlLmosSqQ] type specifier.  The
     value is treated as signed if “,” is specified or unsigned if “.” is
     specified.	 The capitalized types interpret the number as a big endian
     value, whereas the small letter versions interpret the number as a little
     endian value; the m type interprets the number as a middle endian
     (PDP-11) value.  To that number the value of y is added and the result is
     used as an offset in the file.  The default type if one is not specified
     is long.  The following types are recognized:

	   Type	   Mnemonic	 Endian	   Size
	   bcBC	   Byte/Char	 N/A	   1
	   efg	   Double	 Little	   8
	   EFG	   Double	 Big	   8
	   hs	   Half/Short	 Little	   2
	   HS	   Half/Short	 Big	   2
	   i	   ID3		 Little	   4
	   I	   ID3		 Big	   4
	   l	   Long		 Little	   4
	   L	   Long		 Big	   4
	   m	   Middle	 Middle	   4
	   o	   Octal	 Textual   Variable
	   q	   Quad		 Little	   8
	   Q	   Quad		 Big	   8

     That way variable length structures can be examined:

	   # MS Windows executables are also valid MS-DOS executables
	   0	       string	MZ
	   >0x18       uleshort <0x40  MZ executable (MS-DOS)
	   # skip the whole block below if it is not an extended executable
	   >0x18       uleshort >0x3f
	   >>(0x3c.l)  string	PE\0\0 PE executable (MS-Windows)
	   >>(0x3c.l)  string	LX\0\0 LX executable (OS/2)

     This strategy of examining has a drawback: you must make sure that you
     eventually print something, or users may get empty output (such as when
     there is neither PE\0\0 nor LX\0\0 in the above example).
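
     An indirect offset such as (0x3c.l) can be sketched in Python with the
     struct module.  The helper read_indirect is hypothetical and covers only
     the byte, short, long and quad letters from the table above; treating
     the value as unsigned unless asked otherwise is an assumption of the
     sketch.

```python
import struct

# Type letter -> struct format (lower case letter = little endian,
# upper case letter = big endian), for a subset of the table above.
INDIRECT_TYPES = {
    "b": "B", "B": "B",
    "s": "<H", "S": ">H",
    "l": "<I", "L": ">I",
    "q": "<Q", "Q": ">Q",
}


def read_indirect(data, x, type_letter="l", signed=False, y=0):
    """Read a value of the given type at offset x and add y to it."""
    fmt = INDIRECT_TYPES[type_letter]
    if signed:
        fmt = fmt.lower()            # "," selects the signed variant
    (value,) = struct.unpack_from(fmt, data, x)
    return value + y


# A made-up buffer: the long at 0x3c points at a PE signature at 0x80.
sample = bytearray(0x100)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3c, 0x80)
sample[0x80:0x84] = b"PE\0\0"
offset = read_indirect(bytes(sample), 0x3c, "l")
print(hex(offset), sample[offset:offset + 4])
```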

     If this indirect offset cannot be used directly, simple calculations are
     possible: appending [+-*/%&|^]number inside parentheses allows one to
     modify the value read from the file before it is used as an offset:

	   # MS Windows executables are also valid MS-DOS executables
	   0	       string	MZ
	   # sometimes, the value at 0x18 is less than 0x40 but there's still an
	   # extended executable, simply appended to the file
	   >0x18       uleshort <0x40
	   >>(4.s*512) leshort	0x014c	COFF executable (MS-DOS, DJGPP)
	   >>(4.s*512) leshort	!0x014c MZ executable (MS-DOS)
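
     The calculation in (4.s*512) can be sketched as below.  The function
     computed_offset and its arguments are hypothetical; the value is assumed
     to be unsigned, so Python's floor division gives the same result as C
     integer division.

```python
import operator
import struct

# Operators allowed inside the parentheses of an indirect offset.
OFFSET_OPS = {
    "+": operator.add, "-": operator.sub,
    "*": operator.mul, "/": operator.floordiv,
    "%": operator.mod, "&": operator.and_,
    "|": operator.or_, "^": operator.xor,
}


def computed_offset(data, x, fmt, op, number):
    """Read a value at offset x, then modify it with 'op number'."""
    (value,) = struct.unpack_from(fmt, data, x)
    return OFFSET_OPS[op](value, number)


# (4.s*512): little-endian short at offset 4, multiplied by 512.
header = bytearray(8)
struct.pack_into("<H", header, 4, 3)
print(computed_offset(bytes(header), 4, "<H", "*", 512))
```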

     Sometimes you do not know the exact offset as this depends on the length
     or position (when indirection was used before) of preceding fields.  You
     can specify an offset relative to the end of the last up-level field us‐
     ing ‘&’ as a prefix to the offset:

	   0	       string	MZ
	   >0x18       uleshort >0x3f
	   >>(0x3c.l)  string	PE\0\0	  PE executable (MS-Windows)
	   # immediately following the PE signature is the CPU type
	   >>>&0       leshort	0x14c	  for Intel 80386
	   >>>&0       leshort	0x8664	  for x86-64
	   >>>&0       leshort	0x184	  for DEC Alpha
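
     The relative offset &0 in this example can be sketched in Python as
     follows.  The function describe_pe is a hypothetical stand-in for the
     magic entry above and hard-codes its tests; it is not how file(1)
     evaluates magic.

```python
import struct

CPU_TYPES = {
    0x14c: "for Intel 80386",
    0x8664: "for x86-64",
    0x184: "for DEC Alpha",
}


def describe_pe(data):
    """Mimic the PE entry above on a bytes object."""
    if data[:2] != b"MZ":
        return None
    (value_0x18,) = struct.unpack_from("<H", data, 0x18)
    if value_0x18 <= 0x3f:
        return None
    (pe_offset,) = struct.unpack_from("<I", data, 0x3c)   # (0x3c.l)
    signature = b"PE\0\0"
    if data[pe_offset:pe_offset + len(signature)] != signature:
        return None
    # &0 is relative to the end of the up-level match, i.e. the byte
    # immediately after the signature.
    end_of_match = pe_offset + len(signature)
    (machine,) = struct.unpack_from("<H", data, end_of_match + 0)
    parts = ["PE executable (MS-Windows)"]
    if machine in CPU_TYPES:
        parts.append(CPU_TYPES[machine])
    return " ".join(parts)
```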

     Indirect and relative offsets can be combined:

	   0		 string	  MZ
	   >0x18	 uleshort <0x40
	   >>(4.s*512)	 leshort  !0x014c MZ executable (MS-DOS)
	   # if it's not COFF, go back 512 bytes and add the offset taken
	   # from byte 2/3, which is yet another way of finding the start
	   # of the extended executable
	   >>>&(2.s-514) string	  LE	  LE executable (MS Windows VxD driver)

     Or the other way around:

	   0		     string   MZ
	   >0x18	     uleshort >0x3f
	   >>(0x3c.l)	     string   LE\0\0  LE executable (MS-Windows)
	   # at offset 0x80 (-4, since relative offsets start at the end
	   # of the up-level match) inside the LE header, we find the absolute
	   # offset to the code area, where we look for a specific signature
	   >>>(&0x7c.l+0x26) string   UPX     \b, UPX compressed

     Or even both!

	   0		    string   MZ
	   >0x18	    uleshort >0x3f
	   >>(0x3c.l)	    string   LE\0\0 LE executable (MS-Windows)
	   # at offset 0x58 inside the LE header, we find the relative offset
	   # to a data area where we look for a specific signature
	   >>>&(&0x54.l-3)  string   UNACE  \b, ACE self-extracting archive

     If you have to deal with offset/length pairs in your file, even the sec‐
     ond value in a parenthesized expression can be taken from the file it‐
     self, using another set of parentheses.  Note that this additional indi‐
     rect offset is always relative to the start of the main indirect offset.

	   0		     string	  MZ
	   >0x18	     uleshort	  >0x3f
	   >>(0x3c.l)	     string	  PE\0\0 PE executable (MS-Windows)
	   # search for the PE section called ".idata"...
	   >>>&0xf4	     search/0x140 .idata
	   # ...and go to the end of it, calculated from start+length;
	   # these are located 14 and 10 bytes after the section name
	   >>>>(&0xe.l+(-4)) string	  PK\3\4 \b, ZIP self-extracting archive

     If you have a list of known values at a particular continuation level,
     and you want to provide a switch-like default case:

	   # clear that continuation level match
	   >18	   clear   x
	   >18	   lelong  1	   one
	   >18	   lelong  2	   two
	   >18	   default x
	   # print default match
	   >>18	   lelong  x	   unmatched 0x%x
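
     The clear/default pair behaves much like a switch statement with a
     default branch.  A minimal Python sketch, using the hypothetical name
     switch_like and an already-read integer value:

```python
def switch_like(value, cases):
    """Mimic clear / known values / default at one continuation level."""
    matched = False                  # 'clear' resets the match flag
    messages = []
    for wanted, message in cases:
        if value == wanted:
            messages.append(message)
            matched = True
    if not matched:                  # 'default' matches only if nothing else did
        messages.append("unmatched 0x%x" % value)
    return messages


cases = [(1, "one"), (2, "two")]
print(switch_like(2, cases))
print(switch_like(7, cases))
```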

SEE ALSO
     file(1) - the command that reads this file.

BUGS
     The formats long, belong, lelong, melong, short, beshort, and leshort do
     not depend on the length of the C data types short and long on the plat‐
     form, even though the Single UNIX Specification implies that they do.
     However, as OS X Mountain Lion has passed the Single UNIX Specification
     validation suite, and supplies a version of file(1) in which they do not
     depend on the sizes of the C data types and that is built for a 64-bit
     environment in which long is 8 bytes rather than 4 bytes, presumably the
     validation suite does not test whether, for example, long refers to an
     item with the same size as the C data type long.  There should probably
     be type names int8, uint8, int16, uint16, int32, uint32, int64, and
     uint64, and specified-byte-order variants of them, to make it clearer
     that those types have specified widths.
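
     The difference between fixed-width formats and platform C types can be
     seen with Python's struct module, used here purely as an analogy: its
     byte-order-prefixed formats have fixed sizes, like the magic types, while
     its native formats follow the C compiler.  The dictionary name
     MAGIC_WIDTHS is hypothetical.

```python
import struct

# Standard-size formats: fixed widths regardless of platform, which is how
# the magic types short, long and quad behave according to this page.
MAGIC_WIDTHS = {
    "short": struct.calcsize("<h"),
    "long": struct.calcsize("<l"),
    "quad": struct.calcsize("<q"),
}

# Native format: the size of the C type long on this platform.
native_c_long = struct.calcsize("l")

print(MAGIC_WIDTHS, native_c_long)
```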

BSD			       November 27, 2024			   BSD
