Linux affords a whole lot of instructions to assist customers collect details about their host working system: itemizing information or directories to verify attributes; querying to see what packages are put in, processes are operating, and providers begin at boot; or studying concerning the system’s hardware.
Each command makes use of its personal output format to listing this data. You want to make use of instruments like grep
, sed
, and awk
to filter the outcomes to seek out particular data. Also, a whole lot of this data modifications incessantly, resulting in modifications within the system’s state.
It can be useful to view all of this data formatted just like the output of a database SQL question. Imagine that you possibly can question the output of the ps
and rpm
instructions as in the event you had been querying an SQL database desk with related names.
Fortunately, there’s a software that does simply that and far more: Osquery is an open source “SQL powered operating system instrumentation, monitoring, and analytics framework.”
Many functions that deal with safety, DevOps, compliance, and stock administration (to call just a few) depend on the core functionalities offered by Osquery at their coronary heart.
Install Osquery
Osquery is obtainable for Linux, macOS, Windows, and FreeBSD. Install the most recent model in your working system by following its installation instructions. (I will use model Four.7.Zero in these examples.)
After set up, confirm it is working:
$ rpm -qa | grep osquery
osquery-Four.7.Zero-1.linux.x86_64
$
$ osqueryi --version
osqueryi model Four.7.Zero
$
Osquery parts
Osquery has two principal parts:
- osqueri is an interactive SQL question console. It is a standalone utility that doesn’t want super-user privileges (until you’re querying tables that want that degree of entry).
- osqueryd is sort of a monitoring daemon for the host it’s put in on. This daemon can schedule queries to execute at common intervals to collect data from the infrastructure.
You can run the osqueri utility with out having the osqueryd daemon operating. Another utility, osqueryctl, controls beginning, stopping, and checking the standing of the daemon.
$ rpm -ql osquery-Four.eight.Zero-1.linux.x86_64 | grep bin
/usr/bin/osqueryctl
/usr/bin/osqueryd
/usr/bin/osqueryi
$
Use the osqueryi interactive immediate
You work together with Osquery very like you’d use an SQL database. In reality, osqueryi is a modified model of the SQLite shell. Running the osqueryi
command drops you into an interactive shell the place you’ll be able to run instructions particular to Osquery, which regularly begin with a .
:
$ osqueryi
Using a digital database. Need assist, sort '.assist'
osquery>
To stop the interactive shell, run the .stop
command to get again to the working system’s shell:
Find out what tables can be found
As talked about, Osquery makes information accessible because the output of SQL queries. Information in databases is usually saved in tables. But how will you question these tables if you do not know their names? Well, you’ll be able to run the .tables
command to listing all of the tables which you could question. If you’re a long-time Linux consumer or a sysadmin, the desk names will likely be acquainted, as you’ve gotten been utilizing working system instructions to get this data:
osquery> .tables
=> acpi_tables
=> apparmor_events
=> apparmor_profiles
=> apt_sources<< snip >>
=> arp_cache
=> user_ssh_keys
=> customers
=> yara
=> yara_events
=> ycloud_instance_metadata
=> yum_sources
osquery>
Check the schema for particular person tables
Now that you already know the desk names, you’ll be able to see what data every desk supplies. As an instance, select processes
, because the ps
command is used very often to get this data. Run the .schema
command adopted by the desk identify to see what data is saved on this desk. If you wish to verify the outcomes, you possibly can rapidly run ps -ef
or ps aux
and evaluate the output with the contents of the desk:
osquery> .schema processes
CREATE TABLE processes(`pid` BIGINT, `identify` TEXT, `path` TEXT, `cmdline` TEXT, `state` TEXT, `cwd` TEXT, `root` TEXT, `uid` BIGINT, `gid` BIGINT, `euid` BIGINT, `egid` BIGINT, `suid` BIGINT, `sgid` BIGINT, `on_disk` INTEGER, `wired_size` BIGINT, `resident_size` BIGINT, `total_size` BIGINT, `user_time` BIGINT, `system_time` BIGINT, `disk_bytes_read` BIGINT, `disk_bytes_written` BIGINT, `start_time` BIGINT, `mum or dad` BIGINT, `pgroup` BIGINT, `threads` INTEGER, `good` INTEGER, `is_elevated_token` INTEGER HIDDEN, `elapsed_time` BIGINT HIDDEN, `handle_count` BIGINT HIDDEN, `percent_processor_time` BIGINT HIDDEN, `upid` BIGINT HIDDEN, `uppid` BIGINT HIDDEN, `cpu_type` INTEGER HIDDEN, `cpu_subtype` INTEGER HIDDEN, `phys_footprint` BIGINT HIDDEN, PRIMARY KEY (`pid`)) WITHOUT ROWID;
osquery>
To drive house the purpose, use the next command to see the schema for the RPM packages and evaluate the data with rpm -qa
and rpm -qi
working system instructions:
osquery>
osquery> .schema rpm_packages
CREATE TABLE rpm_packages(`identify` TEXT, `model` TEXT, `launch` TEXT, `supply` TEXT, `measurement` BIGINT, `sha1` TEXT, `arch` TEXT, `epoch` INTEGER, `install_time` INTEGER, `vendor` TEXT, `package_group` TEXT, `pid_with_namespace` INTEGER HIDDEN, `mount_namespace_id` TEXT HIDDEN, PRIMARY KEY (`identify`, `model`, `launch`, `arch`, `epoch`, `pid_with_namespace`)) WITHOUT ROWID;
osquery>
You study extra in Osquery’s tables documentation.
Use the PRAGMA command
In case that schema data is just too cryptic for you, there may be one other strategy to print the desk data in a verbose, tabular format: the PRAGMA
command. For instance, I will use PRAGMA
to see data for the rpm_packages
desk in a pleasant format:
osquery> PRAGMA table_info(rpm_packages);
One advantage of this tabular data is which you could deal with the sector you wish to question and see the kind of data that it supplies:
osquery> PRAGMA table_info(customers);
+-----+-------------+--------+---------+------------+----+
| cid | identify | sort | notnull | dflt_value | pk |
+-----+-------------+--------+---------+------------+----+
| Zero | uid | BIGINT | 1 | | 1 |
| 1 | gid | BIGINT | Zero | | Zero |
| 2 | uid_signed | BIGINT | Zero | | Zero |
| three | gid_signed | BIGINT | Zero | | Zero |
| Four | username | TEXT | 1 | | 2 |
| 5 | description | TEXT | Zero | | Zero |
| 6 | listing | TEXT | Zero | | Zero |
| 7 | shell | TEXT | Zero | | Zero |
| eight | uuid | TEXT | 1 | | three |
+-----+-------------+--------+---------+------------+----+
osquery>
Run your first question
Now that you’ve all of the required data from the desk, the schema, and the objects to question, run your first SQL question to view the data. The question under returns the customers which are current on the system and every one’s consumer ID, group ID, house listing, and default shell. Linux customers may get this data by viewing the contents of the /and many others/passwd
file and performing some grep
, sed
, and awk
magic.
osquery>
osquery> choose uid,gid,listing,shell,uuid FROM customers LIMIT 7;
+-----+-----+----------------+----------------+------+
| uid | gid | listing | shell | uuid |
+-----+-----+----------------+----------------+------+
| Zero | Zero | /root | /bin/bash | |
| 1 | 1 | /bin | /sbin/nologin | |
| 2 | 2 | /sbin | /sbin/nologin | |
| three | Four | /var/adm | /sbin/nologin | |
| Four | 7 | /var/spool/lpd | /sbin/nologin | |
| 5 | Zero | /sbin | /bin/sync | |
| 6 | Zero | /sbin | /sbin/shutdown | |
+-----+-----+----------------+----------------+------+
osquery>
Run queries with out coming into interactive mode
What if you wish to run a question with out coming into the osqueri interactive mode? This may very well be very helpful if you’re writing shell scripts round it. In this case, you possibly can echo
the SQL question and pipe it to osqueri proper from the Bash shell:
$ echo "select uid,gid,directory,shell,uuid FROM users LIMIT 7;" | osqueryi
+-----+-----+----------------+----------------+------+
| uid | gid | listing | shell | uuid |
+-----+-----+----------------+----------------+------+
| Zero | Zero | /root | /bin/bash | |
| 1 | 1 | /bin | /sbin/nologin | |
| 2 | 2 | /sbin | /sbin/nologin | |
| three | Four | /var/adm | /sbin/nologin | |
| Four | 7 | /var/spool/lpd | /sbin/nologin | |
| 5 | Zero | /sbin | /bin/sync | |
| 6 | Zero | /sbin | /sbin/shutdown | |
+-----+-----+----------------+----------------+------+
$
Learn what providers begin when booting up
Osquery can even return all of the providers set to start out at boot. For instance, to question the startup_items
desk and get the identify, standing, and path of the primary 5 providers that run at startup:
osquery> SELECT identify,sort,standing,path FROM startup_items LIMIT 5;
identify = README
sort = Startup Item
standing = enabled
path = /and many others/rc.d/init.d/READMEidentify = anamon
sort = Startup Item
standing = enabled
path = /and many others/rc.d/init.d/anamonidentify = features
sort = Startup Item
standing = enabled
path = /and many others/rc.d/init.d/featuresidentify = osqueryd
sort = Startup Item
standing = enabled
path = /and many others/rc.d/init.d/osquerydidentify = AT-SPI D-Bus Bus
sort = Startup Item
standing = enabled
path = /usr/libexec/at-spi-bus-launcher --launch-immediately
osquery>
Look up ELF data for a binary
Imagine you wish to discover out extra particulars concerning the ls
binary. Usually, you’d do it with the readelf -h
command adopted by the ls
command’s path. You can question the elf_info
desk with Osquery and get the identical data:
osquery> SELECT * FROM elf_info WHERE path="/bin/ls";
class = 64
abi = sysv
abi_version = Zero
sort = dyn
machine = 62
model = 1
entry = 24064
flags = Zero
path = /bin/ls
osquery>
Now you’ve gotten a style of how one can use osqueri to search for data of curiosity to you. However, this data is saved on an enormous variety of tables; one system I queried had 156 totally different tables, which will be overwhelming:
$ echo ".tables" | osqueryi | wc -l
156
$
To make issues simpler, you can begin with these tables to get details about your Linux system:
System data desk
osquery> choose * from system_info;
System restrict data
osquery> choose * from ulimit_info;
Files opened by varied processes
osquery> choose * from process_open_files;
Open ports on a system
osquery> choose * from listening_ports;
Running processes data
osquery> choose * from processes;
Installed packages data
osquery> choose * from rpm_packages;
User login data
osquery> choose * from final;
System log data
osquery> choose * from syslog_events;
Learn extra
Osquery is a strong software that gives a whole lot of host data that can be utilized to resolve varied use instances. You can study extra about Osquery by reading its documentation.