Science and technology

Query your Linux working system like a database

Linux affords a whole lot of instructions to assist customers collect details about their host working system: itemizing information or directories to verify attributes; querying to see what packages are put in, processes are operating, and providers begin at boot; or studying concerning the system’s hardware.

Each command makes use of its personal output format to listing this data. You want to make use of instruments like grep, sed, and awk to filter the outcomes to seek out particular data. Also, a whole lot of this data modifications incessantly, resulting in modifications within the system’s state.

It can be useful to view all of this data formatted just like the output of a database SQL question. Imagine that you possibly can question the output of the ps and rpm instructions as in the event you had been querying an SQL database desk with related names.

Fortunately, there’s a software that does simply that and far more: Osquery is an open source “SQL powered operating system instrumentation, monitoring, and analytics framework.”

Many functions that deal with safety, DevOps, compliance, and stock administration (to call just a few) depend on the core functionalities offered by Osquery at their coronary heart.

Install Osquery

Osquery is obtainable for Linux, macOS, Windows, and FreeBSD. Install the most recent model in your working system by following its installation instructions. (I will use model Four.7.Zero in these examples.)

After set up, confirm it is working:

$ rpm -qa | grep osquery
osquery-Four.7.Zero-1.linux.x86_64
$
$ osqueryi --version
osqueryi model Four.7.Zero
$

Osquery parts

Osquery has two principal parts:

  • osqueri is an interactive SQL question console. It is a standalone utility that doesn’t want super-user privileges (until you’re querying tables that want that degree of entry).
  • osqueryd is sort of a monitoring daemon for the host it’s put in on. This daemon can schedule queries to execute at common intervals to collect data from the infrastructure.

You can run the osqueri utility with out having the osqueryd daemon operating. Another utility, osqueryctl, controls beginning, stopping, and checking the standing of the daemon.

$ rpm -ql osquery-Four.eight.Zero-1.linux.x86_64 | grep bin
/usr/bin/osqueryctl
/usr/bin/osqueryd
/usr/bin/osqueryi
$

Use the osqueryi interactive immediate

You work together with Osquery very like you’d use an SQL database. In reality, osqueryi is a modified model of the SQLite shell. Running the osqueryi command drops you into an interactive shell the place you’ll be able to run instructions particular to Osquery, which regularly begin with a .:

$ osqueryi
Using a digital database. Need assist, sort '.assist'
osquery>

To stop the interactive shell, run the .stop command to get again to the working system’s shell:


Find out what tables can be found

As talked about, Osquery makes information accessible because the output of SQL queries. Information in databases is usually saved in tables. But how will you question these tables if you do not know their names? Well, you’ll be able to run the .tables command to listing all of the tables which you could question. If you’re a long-time Linux consumer or a sysadmin, the desk names will likely be acquainted, as you’ve gotten been utilizing working system instructions to get this data:

osquery> .tables
  => acpi_tables
  => apparmor_events
  => apparmor_profiles
  => apt_sources

<< snip >>

  => arp_cache
  => user_ssh_keys
  => customers
  => yara
  => yara_events
  => ycloud_instance_metadata
  => yum_sources
osquery>

Check the schema for particular person tables

Now that you already know the desk names, you’ll be able to see what data every desk supplies. As an instance, select processes, because the ps command is used very often to get this data. Run the .schema command adopted by the desk identify to see what data is saved on this desk. If you wish to verify the outcomes, you possibly can rapidly run ps -ef or ps aux and evaluate the output with the contents of the desk:

osquery> .schema processes
CREATE TABLE processes(`pid` BIGINT, `identify` TEXT, `path` TEXT, `cmdline` TEXT, `state` TEXT, `cwd` TEXT, `root` TEXT, `uid` BIGINT, `gid` BIGINT, `euid` BIGINT, `egid` BIGINT, `suid` BIGINT, `sgid` BIGINT, `on_disk` INTEGER, `wired_size` BIGINT, `resident_size` BIGINT, `total_size` BIGINT, `user_time` BIGINT, `system_time` BIGINT, `disk_bytes_read` BIGINT, `disk_bytes_written` BIGINT, `start_time` BIGINT, `mum or dad` BIGINT, `pgroup` BIGINT, `threads` INTEGER, `good` INTEGER, `is_elevated_token` INTEGER HIDDEN, `elapsed_time` BIGINT HIDDEN, `handle_count` BIGINT HIDDEN, `percent_processor_time` BIGINT HIDDEN, `upid` BIGINT HIDDEN, `uppid` BIGINT HIDDEN, `cpu_type` INTEGER HIDDEN, `cpu_subtype` INTEGER HIDDEN, `phys_footprint` BIGINT HIDDEN, PRIMARY KEY (`pid`)) WITHOUT ROWID;
osquery>

To drive house the purpose, use the next command to see the schema for the RPM packages and evaluate the data with rpm -qa and rpm -qi working system instructions:

osquery>
osquery> .schema rpm_packages
CREATE TABLE rpm_packages(`identify` TEXT, `model` TEXT, `launch` TEXT, `supply` TEXT, `measurement` BIGINT, `sha1` TEXT, `arch` TEXT, `epoch` INTEGER, `install_time` INTEGER, `vendor` TEXT, `package_group` TEXT, `pid_with_namespace` INTEGER HIDDEN, `mount_namespace_id` TEXT HIDDEN, PRIMARY KEY (`identify`, `model`, `launch`, `arch`, `epoch`, `pid_with_namespace`)) WITHOUT ROWID;
osquery>

You study extra in Osquery’s tables documentation.

Use the PRAGMA command

In case that schema data is just too cryptic for you, there may be one other strategy to print the desk data in a verbose, tabular format: the PRAGMA command. For instance, I will use PRAGMA to see data for the rpm_packages desk in a pleasant format:

osquery> PRAGMA table_info(rpm_packages);

One advantage of this tabular data is which you could deal with the sector you wish to question and see the kind of data that it supplies:

osquery> PRAGMA table_info(customers);
+-----+-------------+--------+---------+------------+----+
| cid | identify        | sort   | notnull | dflt_value | pk |
+-----+-------------+--------+---------+------------+----+
| Zero   | uid         | BIGINT | 1       |            | 1  |
| 1   | gid         | BIGINT | Zero       |            | Zero  |
| 2   | uid_signed  | BIGINT | Zero       |            | Zero  |
| three   | gid_signed  | BIGINT | Zero       |            | Zero  |
| Four   | username    | TEXT   | 1       |            | 2  |
| 5   | description | TEXT   | Zero       |            | Zero  |
| 6   | listing   | TEXT   | Zero       |            | Zero  |
| 7   | shell       | TEXT   | Zero       |            | Zero  |
| eight   | uuid        | TEXT   | 1       |            | three  |
+-----+-------------+--------+---------+------------+----+
osquery>

Run your first question

Now that you’ve all of the required data from the desk, the schema, and the objects to question, run your first SQL question to view the data. The question under returns the customers which are current on the system and every one’s consumer ID, group ID, house listing, and default shell. Linux customers may get this data by viewing the contents of the /and many others/passwd file and performing some grep, sed, and awk magic.

osquery>
osquery> choose uid,gid,listing,shell,uuid FROM customers LIMIT 7;
+-----+-----+----------------+----------------+------+
| uid | gid | listing      | shell          | uuid |
+-----+-----+----------------+----------------+------+
| Zero   | Zero   | /root          | /bin/bash      |      |
| 1   | 1   | /bin           | /sbin/nologin  |      |
| 2   | 2   | /sbin          | /sbin/nologin  |      |
| three   | Four   | /var/adm       | /sbin/nologin  |      |
| Four   | 7   | /var/spool/lpd | /sbin/nologin  |      |
| 5   | Zero   | /sbin          | /bin/sync      |      |
| 6   | Zero   | /sbin          | /sbin/shutdown |      |
+-----+-----+----------------+----------------+------+
osquery>

Run queries with out coming into interactive mode

What if you wish to run a question with out coming into the osqueri interactive mode? This may very well be very helpful if you’re writing shell scripts round it. In this case, you possibly can echo the SQL question and pipe it to osqueri proper from the Bash shell:

$ echo "select uid,gid,directory,shell,uuid FROM users LIMIT 7;" | osqueryi
+-----+-----+----------------+----------------+------+
| uid | gid | listing      | shell          | uuid |
+-----+-----+----------------+----------------+------+
| Zero   | Zero   | /root          | /bin/bash      |      |
| 1   | 1   | /bin           | /sbin/nologin  |      |
| 2   | 2   | /sbin          | /sbin/nologin  |      |
| three   | Four   | /var/adm       | /sbin/nologin  |      |
| Four   | 7   | /var/spool/lpd | /sbin/nologin  |      |
| 5   | Zero   | /sbin          | /bin/sync      |      |
| 6   | Zero   | /sbin          | /sbin/shutdown |      |
+-----+-----+----------------+----------------+------+
$

Learn what providers begin when booting up

Osquery can even return all of the providers set to start out at boot. For instance, to question the startup_items desk and get the identify, standing, and path of the primary 5 providers that run at startup:

osquery> SELECT identify,sort,standing,path FROM startup_items LIMIT 5;
  identify = README
  sort = Startup Item
standing = enabled
  path = /and many others/rc.d/init.d/README

  identify = anamon
  sort = Startup Item
standing = enabled
  path = /and many others/rc.d/init.d/anamon

  identify = features
  sort = Startup Item
standing = enabled
  path = /and many others/rc.d/init.d/features

  identify = osqueryd
  sort = Startup Item
standing = enabled
  path = /and many others/rc.d/init.d/osqueryd

  identify = AT-SPI D-Bus Bus
  sort = Startup Item
standing = enabled
  path = /usr/libexec/at-spi-bus-launcher --launch-immediately
osquery>

Look up ELF data for a binary

Imagine you wish to discover out extra particulars concerning the ls binary. Usually, you’d do it with the readelf -h command adopted by the ls command’s path. You can question the elf_info desk with Osquery and get the identical data:

osquery> SELECT * FROM elf_info WHERE path="/bin/ls";
      class = 64
        abi = sysv
abi_version = Zero
       sort = dyn
    machine = 62
    model = 1
      entry = 24064
      flags = Zero
       path = /bin/ls
osquery>

Now you’ve gotten a style of how one can use osqueri to search for data of curiosity to you. However, this data is saved on an enormous variety of tables; one system I queried had 156 totally different tables, which will be overwhelming:

$ echo ".tables" | osqueryi | wc -l
156
$

To make issues simpler, you can begin with these tables to get details about your Linux system:

System data desk

osquery> choose * from system_info;

System restrict data

osquery> choose * from ulimit_info;

Files opened by varied processes

osquery> choose * from process_open_files;

Open ports on a system

osquery> choose * from listening_ports;

Running processes data

osquery> choose * from processes;

Installed packages data

osquery> choose * from rpm_packages;

User login data

osquery> choose * from final;

System log data

osquery> choose * from syslog_events;

Learn extra

Osquery is a strong software that gives a whole lot of host data that can be utilized to resolve varied use instances. You can study extra about Osquery by reading its documentation.

Most Popular

To Top