Introducing Rust calls to C library functions

Why call C functions from Rust? The short answer is software libraries. A longer answer touches on where C stands among programming languages in general and in relation to Rust in particular. C, C++, and Rust are systems languages, which give programmers access to machine-level data types and operations. Among these three systems languages, C remains the dominant one. The kernels of modern operating systems are written mainly in C, with assembly language accounting for the rest. The standard system libraries for input and output, number crunching, cryptography, security, networking, internationalization, string processing, memory management, and more, are likewise written mostly in C. These libraries represent a vast infrastructure for applications written in any other language. Rust is well along the way to providing fine libraries of its own, but C libraries, around since the 1970s and still growing, are a resource not to be ignored. Finally, C is still the lingua franca among programming languages: most languages can talk to C and, through C, to any other language that does so.

Two proof-of-concept examples

Rust has an FFI (Foreign Function Interface) that supports calls to C functions. An issue for any FFI is whether the calling language covers the data types in the called language. For example, ctypes is an FFI for calls from Python into C, but Python does not cover the unsigned integer types available in C. As a result, ctypes must resort to workarounds.

By contrast, Rust covers all the primitive (that is, machine-level) types in C. For example, the Rust i32 type matches the C int type. C specifies only that the char type must be one byte in size and that other types, such as int, must be at least this size; but nowadays every reasonable C compiler supports a four-byte int, an eight-byte double (in Rust, the f64 type), and so on.
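The size claims above can be checked from Rust itself. The following is a minimal sketch, not part of the original examples; the helper name c_type_sizes is made up for illustration, and the sizes it reports depend on the platform, although mainstream desktop and server targets should report 1, 4, and 8 bytes.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_double, c_int};

/// Sizes in bytes of the C-compatible types c_char, c_int, and c_double.
/// (Hypothetical helper, named here only for illustration.)
fn c_type_sizes() -> (usize, usize, usize) {
    (size_of::<c_char>(), size_of::<c_int>(), size_of::<c_double>())
}

fn main() {
    let (char_size, int_size, double_size) = c_type_sizes();
    println!("c_char: {char_size}, c_int: {int_size}, c_double: {double_size}");

    // Compare each C-compatible type with its assumed Rust counterpart.
    println!("c_int same size as i32: {}", int_size == size_of::<i32>());
    println!("c_double same size as f64: {}", double_size == size_of::<f64>());
}
```

On an unusual target where C's int is not four bytes, c_int would differ from i32, which is why FFI declarations use the c_ names rather than fixed-width Rust types.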

There is another challenge for an FFI directed at C: Can the FFI handle C’s raw pointers, including pointers to arrays that count as strings in C? C does not have a string type, but rather implements strings as character arrays with a non-printing terminating character, the null terminator of legend. By contrast, Rust has two string types: String and &str (string slice). The question, then, is whether the Rust FFI can transform a C string into a Rust one, and the answer is yes.
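To make the null terminator concrete, here is a small sketch that uses the standard std::ffi types CString and CStr to move text in both directions. No C function is called; the helper names to_c_bytes and from_c_bytes are invented for this illustration.

```rust
use std::ffi::{CStr, CString};

/// Rust to C: the bytes of a C string, terminating NUL included.
/// Returns None if the text has an interior NUL, which C would
/// mistake for the end of the string.
fn to_c_bytes(text: &str) -> Option<Vec<u8>> {
    CString::new(text).ok().map(|c| c.as_bytes_with_nul().to_vec())
}

/// C to Rust: copies a NUL-terminated byte array into an owned String.
/// Returns None if the terminator is missing or misplaced, or if the
/// bytes are not valid UTF-8.
fn from_c_bytes(bytes: &[u8]) -> Option<String> {
    let c_str = CStr::from_bytes_with_nul(bytes).ok()?;
    c_str.to_str().ok().map(|s| s.to_owned())
}

fn main() {
    println!("{:?}", to_c_bytes("Rust"));       // four letters plus a 0 byte
    println!("{:?}", from_c_bytes(b"C text\0"));
    println!("{:?}", from_c_bytes(b"unterminated"));
}
```

The extra byte at the end of the first result is the terminator that C functions rely on to find the end of a string.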

Pointers to structures are also common in C. The reason is efficiency. By default, a C structure is passed by value (that is, by a byte-per-byte copy) when a structure is either an argument passed to a function or a value returned from one. C structures, like their Rust counterparts, can include arrays and nest other structures and so be arbitrarily large in size. Best practice in either language is to pass and return structures by reference, that is, by passing or returning the structure’s address rather than a copy of the structure. Once again, the Rust FFI is up to the task of handling C pointers to structures, which are common in C libraries.
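The cost difference between copying a structure and passing its address can be sketched in Rust alone. In the sketch below, the Samples structure and the functions make_samples and sum_samples are hypothetical; sum_samples is written in Rust but uses the C calling convention and takes a raw pointer, as a C library function would.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

/// A C-compatible structure large enough to make copying costly.
#[repr(C)]
struct Samples {
    values: [f64; 128],
    count: c_int, // how many entries in values are in use
}

/// Builds a structure with three values in use.
fn make_samples() -> Samples {
    let mut values = [0.0; 128];
    values[0] = 1.0;
    values[1] = 2.0;
    values[2] = 3.0;
    Samples { values, count: 3 }
}

/// Takes the structure's address rather than a copy of the structure.
extern "C" fn sum_samples(samples: *const Samples) -> f64 {
    // A null pointer carries no data, so report zero.
    if samples.is_null() {
        return 0.0;
    }
    // SAFETY: the caller passes the address of a live, initialized Samples.
    let s = unsafe { &*samples };
    // Clamp the count so that a bad value cannot index out of bounds.
    let used = (s.count.max(0) as usize).min(s.values.len());
    s.values[..used].iter().sum()
}

fn main() {
    let samples = make_samples();
    println!(
        "structure: {} bytes, pointer: {} bytes",
        size_of::<Samples>(),
        size_of::<*const Samples>()
    );
    println!("sum: {}", sum_samples(&samples));
}
```

Passing &samples hands over only an address, whatever the size of the structure behind it.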

The first code example focuses on calls to relatively simple C library functions such as abs (absolute value) and sqrt (square root). These functions take non-pointer scalar arguments and return a non-pointer scalar value. The second code example, which covers strings and pointers to structures, introduces the bindgen utility, which generates Rust code from C interface (header) files such as math.h and time.h. C header files specify the calling syntax for C functions and define structures used in such calls. The two code examples are available on my homepage.

Calling relatively simple C functions

The first code example has four Rust calls to C functions in the standard mathematics library: one call apiece to abs (absolute value) and pow (exponentiation), and two calls to sqrt (square root). The program can be built directly with the rustc compiler, or more conveniently with the cargo build command:

use std::os::raw::c_int;    // 32 bits
use std::os::raw::c_double; // 64 bits

// Import three functions from the standard library libc.
// Here are the Rust declarations for the C functions:
extern "C" {
    fn abs(num: c_int) -> c_int;
    fn sqrt(num: c_double) -> c_double;
    fn pow(num: c_double, power: c_double) -> c_double;
}

fn main() {
    let x: i32 = -123;
    println!("\nAbsolute value of {x}: {}.",
             unsafe { abs(x) });

    let n: f64 = 9.0;
    let p: f64 = 3.0;
    println!("\n{n} raised to {p}: {}.",
             unsafe { pow(n, p) });

    let mut y: f64 = 64.0;
    println!("\nSquare root of {y}: {}.",
             unsafe { sqrt(y) });
    y = -3.14;
    println!("\nSquare root of {y}: {}.",
             unsafe { sqrt(y) }); //** NaN = NotaNumber
}

The two use declarations at the top are for the Rust data types c_int and c_double, which match the C types int and double, respectively. The standard Rust module std::os::raw defines fourteen such types for C compatibility. The module std::ffi has the same fourteen type definitions together with support for strings.

The extern "C" block above the main function declares the three C library functions called in the main function below it. Each call uses the standard C function’s name, but each call must occur inside an unsafe block. As every programmer new to Rust discovers, the Rust compiler enforces memory safety with a vengeance. Other languages (in particular, C and C++) do not make the same guarantees. The unsafe block thus says: Rust takes no responsibility for whatever unsafe operations may occur in the external call.
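A common way to keep unsafe blocks from spreading through a program is to confine each external call to a small safe wrapper. The sketch below assumes a platform where c_int is a 32-bit integer; the wrapper name checked_abs is hypothetical. It also rules out the one argument for which C leaves abs undefined, the most negative int.

```rust
use std::os::raw::c_int;

// Declaration of the C library function, as in the first example.
extern "C" {
    fn abs(num: c_int) -> c_int;
}

/// Safe wrapper: callers need no unsafe block of their own.
fn checked_abs(num: c_int) -> Option<c_int> {
    // In C, abs of the most negative int is undefined, so rule it out first.
    if num == c_int::MIN {
        return None;
    }
    // SAFETY: abs takes and returns a plain integer, and the one
    // problematic argument has been excluded above.
    Some(unsafe { abs(num) })
}

fn main() {
    println!("{:?}", checked_abs(-123));       // Some(123)
    println!("{:?}", checked_abs(c_int::MIN)); // None
}
```

The wrapper does not make the C function any safer; it records, in one place, the conditions under which the call is believed to be sound.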

The first program’s output is:

Absolute value of -123: 123.
9 raised to 3: 729.
Square root of 64: 8.
Square root of -3.14: NaN.

In the last output line, the NaN stands for Not a Number: the C sqrt library function expects a non-negative value as its argument, which means that the argument -3.14 generates NaN as the returned value.
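A caller that prefers not to let NaN flow silently into later arithmetic can test for it explicitly. The following sketch wraps the C sqrt function; the name real_sqrt is made up for illustration, and returning an Option is one design choice among several.

```rust
use std::os::raw::c_double;

// Declaration of the C library function, as in the first example.
extern "C" {
    fn sqrt(num: c_double) -> c_double;
}

/// Calls the C sqrt function and turns a NaN result into None.
fn real_sqrt(num: f64) -> Option<f64> {
    // SAFETY: sqrt takes and returns a plain double.
    let root = unsafe { sqrt(num) };
    // NaN never compares equal to anything, itself included,
    // so is_nan() is the reliable test.
    if root.is_nan() { None } else { Some(root) }
}

fn main() {
    println!("{:?}", real_sqrt(64.0));
    println!("{:?}", real_sqrt(-3.14));
}
```

Comparing the result with f64::NAN using == would not work here, because such a comparison is always false.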

Calling C functions involving pointers

C library functions in security, networking, string processing, memory management, and other areas regularly use pointers for efficiency. For example, the library function asctime (time as an ASCII string) expects a pointer to a structure as its single argument. A Rust call to a C function such as asctime is thus trickier than a call to sqrt, which involves neither pointers nor structures.

The C structure for the asctime function call is of type struct tm. A pointer to such a structure is also passed to the library function mktime (make a time value). The structure breaks a time into units such as the year, the month, the hour, and so forth. The structure’s fields are of type int, whereas mktime returns a value of type time_t, an alias for an integer type such as int (32 bits) or long (64 bits). The two library functions combine these broken-apart time pieces into a single value: asctime returns a string representation of the time, whereas mktime returns a time_t value that represents the number of elapsed seconds since the epoch, which is a time relative to which a system’s clock and timestamp are determined. Typical epoch settings are January 1 00:00:00 (zero hours, minutes, and seconds) of either 1900 or 1970.
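As a worked example of elapsed seconds, the distance between the two typical epochs can be computed directly. This sketch is plain Rust with no C calls; the function names are invented for illustration, and leap seconds are ignored.

```rust
const SECONDS_PER_DAY: u64 = 24 * 60 * 60;

/// Gregorian leap-year rule.
fn is_leap_year(year: u32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Seconds from January 1 00:00:00 of `from` to January 1 00:00:00 of `to`,
/// ignoring leap seconds.
fn seconds_between_epochs(from: u32, to: u32) -> u64 {
    let days: u64 = (from..to)
        .map(|year| if is_leap_year(year) { 366 } else { 365 })
        .sum();
    days * SECONDS_PER_DAY
}

fn main() {
    // 70 years with 17 leap days: 25,567 days in all.
    println!("{}", seconds_between_epochs(1900, 1970)); // 2208988800
}
```

A timestamp counted from 1900 is therefore 2,208,988,800 seconds larger than the same instant counted from 1970.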

The C program below calls asctime and mktime, and uses another library function, strftime, to convert the mktime returned value into a formatted string. This program acts as a warm-up for the Rust version:

#include <stdio.h>
#include <time.h>

int main () {
  struct tm someday;  /* time broken out in detail */
  char buffer[80];
  int utc;

  someday.tm_sec = 1;
  someday.tm_min = 1;
  someday.tm_hour = 1;
  someday.tm_mday = 1;
  someday.tm_mon = 1;
  someday.tm_year = 1;
  someday.tm_isdst = -1; /* let the library decide on daylight saving time */
  someday.tm_wday = 1;
  someday.tm_yday = 1;

  printf("Date and time: %s\n", asctime(&someday));

  utc = mktime(&someday);
  if( utc < 0 ) {
    fprintf(stderr, "Error: unable to make time using mktime\n");
  } else {
    printf("The integer value returned: %d\n", utc);
    strftime(buffer, sizeof(buffer), "%c", &someday);
    printf("A more readable version: %s\n", buffer);
  }

  return 0;
}

The program outputs:

Date and time: Fri Feb  1 01:01:01 1901
The integer value returned: 2120218157
A more readable version: Fri Feb  1 01:01:01 1901

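In summary, the Rust calls to library functions asctime and mktime must deal with two issues:

  • Passing a raw pointer to a structure as the single argument to each library function.

  • Converting the C string returned from asctime into a Rust string.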

Rust calls to asctime and mktime

The bindgen utility generates Rust support code from C header files such as math.h and time.h. In this example, a simplified version of time.h will do but with two changes from the original:

  • The built-in type int is used instead of the alias type time_t. The bindgen utility can handle the time_t type but generates some distracting warnings along the way because time_t does not follow Rust naming conventions: in time_t an underscore separates the t at the end from the time that comes first; Rust would prefer a CamelCase name such as TimeT.

  • The type struct tm is given StructTM as an alias for the same reason.

Here is the simplified header file with declarations for mktime and asctime at the bottom:

typedef struct tm {
    int tm_sec;    /* seconds */
    int tm_min;    /* minutes */
    int tm_hour;   /* hours */
    int tm_mday;   /* day of the month */
    int tm_mon;    /* month */
    int tm_year;   /* year */
    int tm_wday;   /* day of the week */
    int tm_yday;   /* day in the year */
    int tm_isdst;  /* daylight saving time */
} StructTM;

extern int mktime(StructTM*);
extern char* asctime(StructTM*);

With bindgen installed, % as the command-line prompt, and mytime.h as the header file above, the following command generates the required Rust code and saves it in the file mytime.rs:

% bindgen mytime.h > mytime.rs

Here is the relevant part of mytime.rs:

/* automatically generated by rust-bindgen 0.61.0 */

#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct tm {
    pub tm_sec: ::std::os::raw::c_int,
    pub tm_min: ::std::os::raw::c_int,
    pub tm_hour: ::std::os::raw::c_int,
    pub tm_mday: ::std::os::raw::c_int,
    pub tm_mon: ::std::os::raw::c_int,
    pub tm_year: ::std::os::raw::c_int,
    pub tm_wday: ::std::os::raw::c_int,
    pub tm_yday: ::std::os::raw::c_int,
    pub tm_isdst: ::std::os::raw::c_int,
}

pub type StructTM = tm;

extern "C" {
    pub fn mktime(arg1: *mut StructTM) -> ::std::os::raw::c_int;
}

extern "C" {
    pub fn asctime(arg1: *mut StructTM) -> *mut ::std::os::raw::c_char;
}

#[test]
fn bindgen_test_layout_tm() {
    const UNINIT: ::std::mem::MaybeUninit<tm> =
       ::std::mem::MaybeUninit::uninit();
    let ptr = UNINIT.as_ptr();
    assert_eq!(
        ::std::mem::size_of::<tm>(),
        36usize,
        concat!("Size of: ", stringify!(tm))
    );
    ...

The Rust structure struct tm, like the C original, contains nine 4-byte integer fields. The field names are the same in C and Rust. The extern "C" blocks declare the library functions asctime and mktime as taking one argument apiece, a raw pointer to a mutable StructTM instance. (The library functions may mutate the structure via the pointer passed as an argument.)

The remaining code, under the #[test] attribute, tests the layout of the Rust version of the time structure. The test can be run with the cargo test command. At issue is that C leaves some details of structure layout to the compiler. For example, the C struct tm starts out with the field tm_sec for the second; C requires that the fields be stored in their declared order, with the first at the start of the structure, but it lets the compiler insert padding between or after the fields. In any case, the Rust tests should succeed and the Rust calls to the library functions should work as expected.
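The kind of check that the generated test performs can be sketched by hand. The structure below is a hand-written stand-in named Tm (not the bindgen output), and the helper offset_of_tm_min is hypothetical; the expected values assume a platform where c_int is four bytes, in which case nine such fields need no padding.

```rust
use std::mem::{align_of, size_of};
use std::os::raw::c_int;

/// Hand-written stand-in for the generated structure, with the same nine fields.
#[repr(C)]
#[derive(Debug, Copy, Clone, Default)]
struct Tm {
    tm_sec: c_int,
    tm_min: c_int,
    tm_hour: c_int,
    tm_mday: c_int,
    tm_mon: c_int,
    tm_year: c_int,
    tm_wday: c_int,
    tm_yday: c_int,
    tm_isdst: c_int,
}

/// Byte offset of the tm_min field from the start of the structure.
fn offset_of_tm_min() -> usize {
    let t = Tm::default();
    let base = &t as *const Tm as usize;
    let field = &t.tm_min as *const c_int as usize;
    field - base
}

fn main() {
    // With a four-byte c_int: size 36, alignment 4, tm_min at offset 4.
    println!("size: {}", size_of::<Tm>());
    println!("alignment: {}", align_of::<Tm>());
    println!("offset of tm_min: {}", offset_of_tm_min());
}
```

The #[repr(C)] attribute is what makes these numbers meaningful: without it, the Rust compiler is free to reorder the fields.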

Getting the second example up and running

The code generated from bindgen does not include a main function and, therefore, is a pure module. Below is the main function with the StructTM initialization and the calls to asctime and mktime:

mod mytime;
use mytime::*;
use std::ffi::CStr;

fn main() {
    let mut someday  = StructTM {
        tm_year: 1,
        tm_mon: 1,
        tm_mday: 1,
        tm_hour: 1,
        tm_min: 1,
        tm_sec: 1,
        tm_isdst: -1,
        tm_wday: 1,
        tm_yday: 1
    };

    unsafe {
        // raw pointer to the structure on the stack
        let c_ptr: *mut StructTM = &mut someday;

        // make the call, then view the returned
        // C string as a Rust string slice
        let char_ptr = asctime(c_ptr);
        let c_str = CStr::from_ptr(char_ptr);
        println!("{:#?}", c_str.to_str());

        let utc = mktime(c_ptr);
        println!("{}", utc);
    }
}

The Rust code can be compiled (using either rustc directly or cargo) and then run. The output is:

Ok(
    "Mon Feb  1 01:01:01 1901\n",
)
2120218157

The calls to the C functions asctime and mktime again must occur inside an unsafe block, as the Rust compiler cannot be held responsible for any memory-safety mischief in these external functions. For the record, asctime and mktime are well behaved. In the calls to both functions, the argument is the raw pointer c_ptr, which holds the (stack) address of the someday structure.

The call to asctime is the trickier of the two calls because this function returns a pointer to a C char, the character M in Mon of the text output. Yet the Rust compiler does not know where the C string (the null-terminated array of char) is stored. In the static area of memory? On the heap? The array used by the asctime function to store the text representation of the time is, in fact, in the static area of memory. In any case, the C-to-Rust string conversion is done in two steps to avoid compile-time errors:

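  1. The call CStr::from_ptr(char_ptr) wraps the C string in a Rust CStr and returns a reference stored in the c_str variable.

  2. The call to c_str.to_str() checks that the bytes are valid UTF-8 and returns a Result that holds a borrowed string slice (&str). The characters still live in the C array; an owned String would require a further copy, for example with to_owned().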
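The two steps, plus an optional third that copies the text into a String owned by Rust, can be sketched without calling asctime at all. In the sketch below, the function copy_c_string is hypothetical, and a CString stands in for the static array that asctime would return a pointer to.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Copies the C string at `ptr` into an owned Rust String.
/// Returns None for a null pointer or for bytes that are not valid UTF-8.
///
/// # Safety
/// A non-null `ptr` must point to a NUL-terminated array that stays
/// valid for the duration of the call.
unsafe fn copy_c_string(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // Step 1: borrow the C array as a CStr (no copy is made).
    let borrowed: &CStr = unsafe { CStr::from_ptr(ptr) };
    // Step 2: check for valid UTF-8, yielding a borrowed &str.
    let slice: &str = borrowed.to_str().ok()?;
    // Step 3: copy the characters into memory that Rust owns.
    Some(slice.to_owned())
}

fn main() {
    // Stand-in for the static array used by asctime.
    let buffer = CString::new("Mon Feb  1 01:01:01 1901\n").expect("no interior NUL");
    let owned = unsafe { copy_c_string(buffer.as_ptr()) };
    println!("{:?}", owned);
}
```

Copying matters for a function such as asctime, whose static array may be overwritten by a later call.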

The Rust code does not generate a human-readable version of the integer value returned from mktime, which is left as an exercise for the reader. The chrono crate has a format::strftime module that supports strftime-style format strings, which can be used much like the C function of the same name to get a text representation of the time.

Calling C with FFI and bindgen

The Rust FFI and the bindgen utility are well designed for making Rust calls out to C libraries, whether standard or third-party. Rust talks readily to C and thereby to any other language that talks to C. For calling relatively simple library functions such as sqrt, the Rust FFI is straightforward because Rust’s primitive data types cover their C counterparts.

For more complicated interchanges (in particular, Rust calls to C library functions such as asctime and mktime that involve structures and pointers), the bindgen utility is the way to go. This utility generates the support code together with appropriate tests. Of course, the Rust compiler cannot assume that C code measures up to Rust standards when it comes to memory safety; hence, calls from Rust to C must occur in unsafe blocks.
