Translate C++/Rust type into C type with the same memory layout

Overview

clayout translates a C++/Rust type into a C type with the same memory layout. It is typically used together with bpftrace.

clayout is built on top of ddbug. Thanks to ddbug!

Usage

Suppose you want to use bpftrace to track the value of S::x while the following program runs.

#include <stdio.h>
#include <unistd.h>

struct X {
  virtual ~X() {}

  int x1;
};

struct S : public X {
  S() : x(0) {}

  S(const S &other) : x(other.x) {}

  S f(int y, int z) {
    printf("output from a.out: this.x=%d y=%d z=%d\n", x, y, z);
    x += (y + z);
    return *this;
  }

  int x;
};

int main(int argc, char **argv) {
  S s;
  int i = 0;
  while (1) {
    s.f(i, i);
    ++i;
    sleep(1);
    // break;
  }
  return 0;
}

clayout can translate S into a C structure with the same memory layout:

# clayout will generate struct.h, struct.c
$ clayout -i ${binary path} -o struct S
// struct.h
// Generated by hidva/clayout! 大吉大利!
#pragma once
#include <linux/types.h>
struct HidvaStruct2 {
  void** __mem1;
  int x1;
} __attribute__((__packed__));


struct S {
  struct HidvaStruct2 __parent0;
  int x;
} __attribute__((__packed__));

So you can easily write the following bpftrace script:

#include "struct.h"

u:/apsara/zhanyi.ww/tmp/bphtrace/x/trace:_ZN1S1fEii {
  printf("output from bpftrace: ret=%p this.x=%d y=%d z=%d\n", (int32*)arg0, ((struct S*)arg1)->x, arg2, arg3)
}
$ bpftrace  -c ./trace t.bt
Attaching 1 probe...
output from a.out: this.x=0 y=0 z=0
output from bpftrace: ret=0x7ffff3044610 this.x=0 y=0 z=0
output from a.out: this.x=0 y=1 z=1
output from bpftrace: ret=0x7ffff3044610 this.x=0 y=1 z=1
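
The script reads the return slot from arg0 and `this` from arg1, which can look surprising. A minimal sketch of why, assuming the Itanium C++ ABI used by GCC and Clang on Linux: S has a user-provided copy constructor, so it is returned through a hidden pointer that is passed as the first argument, ahead of `this`. The function `lowered_f` and the helper `lowered_call_matches` are hypothetical names written for illustration, not something the compiler or clayout exposes, and S is trimmed to the members that matter here.

```cpp
#include <cassert>
#include <new>

// Trimmed version of S from the example (base class X omitted for brevity).
struct S {
  S() : x(0) {}
  S(const S &other) : x(other.x) {}  // user-provided: S is non-trivial for calls
  S f(int y, int z) {
    x += (y + z);
    return *this;
  }
  int x;
};

// Hypothetical hand-lowered form of S::f under the assumed ABI:
//   arg0 = address of the caller-provided return slot
//   arg1 = this
//   arg2 = y, arg3 = z
// The slot address is also handed back to the caller.
S *lowered_f(void *ret_slot, S *self, int y, int z) {
  self->x += (y + z);
  return new (ret_slot) S(*self);  // copy-construct the result into the slot
}

// Calls both forms with the same arguments and reports whether they agree.
bool lowered_call_matches(int y, int z) {
  S a, b;
  S direct = a.f(y, z);
  alignas(S) unsigned char slot[sizeof(S)];
  S *via_slot = lowered_f(slot, &b, y, z);
  bool same = (direct.x == via_slot->x) && (a.x == b.x);
  via_slot->~S();  // the slot holds a live S, so destroy it explicitly
  return same;
}
```

Because the probe fires on function entry, it sees x before `x += (y + z)` runs, which matches the values printed by a.out.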

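Note that the following layout, although intuitive, is wrong:

struct X {
  void** __mem1;
  int x1;
};

struct S {
  struct X __parent0;
  int x;
};

In C++, S::x reuses the tail padding of X, so it starts at offset 12, directly after x1, and not at offset 16 as this layout implies. The packed structs in the generated struct.h reproduce the real offsets.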
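
The difference can be checked with a small sketch, assuming a 64-bit target that follows the Itanium C++ ABI (GCC or Clang on Linux); other ABIs such as MSVC may lay out S differently. The names NaiveX, NaiveS, PackedX, PackedS, `real_offset_of_x` and `check_layouts` are hypothetical, and the member names drop the leading double underscore because such identifiers are reserved in C++.

```cpp
#include <cassert>
#include <cstddef>

// The C++ types from the example (constructors and f() trimmed).
struct X {
  virtual ~X() {}
  int x1 = 0;
};

struct S : public X {
  int x = 0;
};

// The intuitive but wrong translation: X is padded to 16 bytes,
// so x ends up after the padding.
struct NaiveX {
  void **mem1;
  int x1;
};
struct NaiveS {
  NaiveX parent0;
  int x;
};

// The packed translation, mirroring the generated struct.h.
struct PackedX {
  void **mem1;
  int x1;
} __attribute__((__packed__));
struct PackedS {
  PackedX parent0;
  int x;
} __attribute__((__packed__));

// Offset of S::x measured on a live object. This avoids offsetof, which is
// only conditionally supported for non-standard-layout classes such as S.
std::size_t real_offset_of_x() {
  S s;
  const char *base = reinterpret_cast<const char *>(&s);
  const char *field = reinterpret_cast<const char *>(&s.x);
  return static_cast<std::size_t>(field - base);
}

// Asserts that the packed translation matches the real layout
// and that the naive one does not.
void check_layouts() {
  assert(real_offset_of_x() == offsetof(PackedS, x));
  assert(real_offset_of_x() != offsetof(NaiveS, x));
  assert(sizeof(S) == sizeof(PackedS));
}
```

Calling `check_layouts()` from a test or from main is enough to confirm the claim on a given toolchain.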

Multiple input files

clayout accepts multiple input files and resolves type references across them.

// x.h
struct X {
  virtual ~X();

  int x1;
};

struct S : public X {
  S();

  S(const S &other);

  S f(int y, int z);

  int x;
};

// X.cc
#include <stdio.h>
#include "x.h"

X::~X() {}

S::S(): x(0) {}

S::S(const S &other) : x(other.x) {}

S S::f(int y, int z) {
  printf("output from a.out: this.x=%d y=%d z=%d\n", x, y, z);
  x += (y + z);
  return *this;
}

// trace.cc
#include <unistd.h>
#include "x.h"

int main(int argc, char **argv) {
  S s;
  int i = 0;
  while (1) {
    s.f(i, i);
    ++i;
    sleep(1);
  }
  return 0;
}
$ clang++ -fPIC -shared -g -O0 X.cc -o libzh_x.so
$ clang++ -g -O0 trace.cc -o trace -L. -lzh_x

Because clang defaults to -fno-standalone-debug on Linux, the trace binary contains only a declaration of X, not its full debugging information:

$ readelf --debug-dump=info trace
 <1><fc>: Abbrev Number: 13 (DW_TAG_structure_type)
    <fd>   DW_AT_name        : X
    <ff>   DW_AT_declaration : 1

Since the trace binary has no definition of X, clayout emits a 12-byte placeholder, __u8 __unknown_type1[12], in its place:

$ clayout -i trace -o output S
// output.h
// Generated by hidva/clayout! 大吉大利!
#pragma once
#include <linux/types.h>

struct S {
  __u8 __unknown_type1[12];
  int x;
} __attribute__((__packed__));
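
Even with the placeholder, the struct is still usable for reading S::x, because only the offset of x matters. A minimal sketch of that idea, again assuming the Itanium C++ ABI on a 64-bit target; SView and `read_x_through_view` are hypothetical names, and S gets an extra constructor here purely so the test value can be chosen.

```cpp
#include <cassert>
#include <cstring>

// The C++ types from x.h, trimmed to what affects layout.
struct X {
  virtual ~X() {}
  int x1 = 0;
};

struct S : public X {
  explicit S(int v) : x(v) {}  // illustrative constructor, not in x.h
  int x;
};

// Mirrors the generated output.h: 12 opaque bytes stand in for X.
struct SView {
  unsigned char unknown_type1[12];
  int x;
} __attribute__((__packed__));

// Copies the raw bytes of a real S into the view and reads x from it.
int read_x_through_view(const S &s) {
  static_assert(sizeof(SView) == sizeof(S),
                "sketch assumes the Itanium C++ ABI layout of S");
  SView view;
  std::memcpy(&view, static_cast<const void *>(&s), sizeof(view));
  return view.x;
}
```

The placeholder hides x1 and the vtable pointer, so those members are only reachable once the full definition of X is available.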

Passing libzh_x.so as a second input lets clayout resolve the full definition of X:

$ clayout -i trace -i libzh_x.so -o output S
// output.h
// Generated by hidva/clayout! 大吉大利!
#pragma once
#include <linux/types.h>
struct HidvaStruct2 {
  void** __mem1;
  int x1;
} __attribute__((__packed__));


struct S {
  struct HidvaStruct2 __parent0;
  int x;
} __attribute__((__packed__));
Comments
  • pretty type name

    union u {
    	long u_l;
    	char u_c;
    	long u_b: 2;
    	S1 s1;
    };
    

    will be translated to

    struct HidvaStruct1 {  // #2
      long int l;
      char ch;
    } __attribute__((__packed__));
    
    
    // tyidx=TypeIndex { input_id: 0, typoff: TypeOffset(92) } tyname=::u
    union HidvaUnion3 {  // #1
      long int u_l;
      char u_c;
      __u8 __bitfield0[1];  
      struct HidvaStruct1 s1;  // #2
      __u8 __HIDVA_dont_use2[16];
    } __attribute__((__packed__));
    
  • duplicate type debuginfo

    // tyname=::std::_Rb_tree_node_base tyidx=TypeIndex { input_id: 0, typoff: TypeOffset(60984357) }
    // tyname=::std::_Rb_tree_node_base tyidx=TypeIndex { input_id: 0, typoff: TypeOffset(28125307) }
    // tyname=::std::_Rb_tree_node_base tyidx=TypeIndex { input_id: 0, typoff: TypeOffset(69763994) }
    // tyname=::std::_Rb_tree_node_base tyidx=TypeIndex { input_id: 0, typoff: TypeOffset(22768814) }
    // tyname=::std::_Rb_tree_node_base tyidx=TypeIndex { input_id: 0, typoff: TypeOffset(56430823) }
    // tyname=::std::_Rb_tree_node_base tyidx=TypeIndex { input_id: 0, typoff: TypeOffset(31413305) }
    
    $ grep -F 'char_Array16' 12.h  |grep -F typedef
    typedef char char_Array16[16];
    typedef char char_Array16_1[16];
    typedef char char_Array16_2[16];
    typedef char char_Array16_3[16];
    typedef char char_Array16_4[16];
    typedef char char_Array16_5[16];
    typedef char char_Array16_6[16];
    typedef char char_Array16_7[16];
    typedef char char_Array16_8[16];
    typedef char char_Array16_9[16];
    typedef char char_Array16_10[16];
    typedef char char_Array16_11[16];
    typedef char char_Array16_12[16];
    
  • bitfield support

    union u {
    	long u_l;
    	char u_c;
    	long u_b: 2;
    	S1 s1;
    };
    

    will be translated to:

    union HidvaUnion3 {
      long int u_l;
      char u_c;
      __u8 __bitfield0[1];   // #1
      struct HidvaStruct1 s1;
      __u8 __HIDVA_dont_use2[16];
    } __attribute__((__packed__));
    
Owner
盏一