Fix prefix calculation for AUTO_NO32 ops, fix encoding data for call reg, change functionality of the Pointer! macros in dynasmrt, add machinery to extract the list of supported instructions from dynasm's internal encoding data and write a tutorial
CensoredUsername committed Aug 26, 2016
1 parent bea3ac9 commit 43be0cb
Showing 9 changed files with 218 additions and 11 deletions.
3 changes: 3 additions & 0 deletions build_docs.sh
@@ -17,6 +17,9 @@ mkdir ./build_docs/language
mkdir ./build_docs/plugin
mkdir ./build_docs/runtime

# create instruction reference markdown file
(cd doc/insref && cargo run > ../instructionref.md)

# build plugin docs
for f in ./doc/*.md; do
rustdoc $f -o ./build_docs/language --markdown-no-toc --html-before-content=./doc/pre.html --html-after-content=./doc/post.html --markdown-css=./formatting.css
1 change: 1 addition & 0 deletions doc/.gitignore
@@ -0,0 +1 @@
instructionref.md
10 changes: 10 additions & 0 deletions doc/insref/Cargo.toml
@@ -0,0 +1,10 @@
[package]
name = "insref"
version = "0.0.1"
authors = ["CensoredUsername <[email protected]>"]

[dependencies]
itertools = "0.4.*"

[dependencies.dynasm]
path = "../../plugin"
37 changes: 37 additions & 0 deletions doc/insref/src/main.rs
@@ -0,0 +1,37 @@
extern crate itertools;

// we generate this list directly from dynasm's internals
#[allow(plugin_as_library)]
extern crate dynasm;

use dynasm::debug;
use dynasm::x64data;

use std::io::{self, Write};
use itertools::Itertools;

fn main() {
    let stdout = io::stdout();
    let mut stdout = stdout.lock();
    stdout.write_all(b"% Instruction Reference\n\n").unwrap();

    let mut mnemnonics: Vec<_> = x64data::mnemnonics().cloned().collect();
    mnemnonics.sort();

    for mnemnonic in mnemnonics {
        let data = x64data::get_mnemnonic_data(mnemnonic).unwrap();
        let mut formats = data.into_iter()
            .map(|x| debug::format_opdata(mnemnonic, x))
            .flatten()
            .map(|x| x.replace(">>> ", ""))
            .collect::<Vec<_>>();
        formats.sort();

        stdout.write_all(b"### ").unwrap();
        stdout.write_all(mnemnonic.as_bytes()).unwrap();
        stdout.write_all(b"\n```\n").unwrap();

        stdout.write_all(formats.join("\n").as_bytes()).unwrap();
        stdout.write_all(b"\n```\n").unwrap();
    }
}
152 changes: 151 additions & 1 deletion doc/tutorial.md
@@ -1,3 +1,153 @@
% Tutorial

Coming soon.
# Introduction

Dynasm-rs is a library and syntax extension for assembling code at runtime. The first part of this tutorial examines the following example program, which assembles a simple function at runtime:

```
#![feature(plugin)]
#![plugin(dynasm)]
#[macro_use]
extern crate dynasmrt;
use dynasmrt::DynasmApi;
use std::{io, slice, mem};
use std::io::Write;
fn main() {
    let mut ops = dynasmrt::Assembler::new();
    let string = "Hello World!";
    dynasm!(ops
        ; ->hello:
        ; .bytes string.as_bytes()
    );
    let hello = ops.offset();
    dynasm!(ops
        ; lea rcx, [->hello]
        ; xor edx, edx
        ; mov dl, BYTE string.len() as _
        ; mov rax, QWORD print as _
        ; sub rsp, BYTE 0x28
        ; call rax
        ; add rsp, BYTE 0x28
        ; ret
    );
    let buf = ops.finalize().unwrap();
    let hello_fn: extern "win64" fn() -> bool = unsafe {
        mem::transmute(buf.ptr(hello))
    };
    assert!(
        hello_fn()
    );
}
pub extern "win64" fn print(buffer: *const u8, length: u64) -> bool {
    io::stdout().write_all(unsafe {
        slice::from_raw_parts(buffer, length as usize)
    }).is_ok()
}
```

We will now examine this code snippet piece by piece.

```
#![feature(plugin)]
#![plugin(dynasm)]
```
To use the `dynasm!` procedural macro, the dynasm plugin must first be loaded. As compiler plugins are currently unstable, the `plugin` feature has to be enabled first, which currently requires a nightly version of rustc.

```
#[macro_use]
extern crate dynasmrt;
use dynasmrt::DynasmApi;
```
We then link to the dynasm runtime crate, `dynasmrt`. The `#[macro_use]` attribute also loads the utility macros this crate provides, although they are not used in this example.
Furthermore, the `DynasmApi` trait is imported. This trait defines the interface used by the `dynasm!` procedural macro to produce assembled code.

```
let mut ops = dynasmrt::Assembler::new();
```
The generated machine code needs to live somewhere. `dynasmrt::Assembler` is a struct that implements the `DynasmApi` trait, provides storage for the generated machine code, handles memory permissions and provides various utilities for dynamically assembling code. It even allows assembling code in one thread while several other threads execute said code. For this example, though, we will use it in the simplest way: assembling everything in advance and then executing it.

```
dynasm!(ops
    ; ->hello:
    ; .bytes string.as_bytes()
);
```
The first invocation of the `dynasm!` macro shows off two features of dynasm. The first line defines a global label `hello` which can be referenced later, while the second line contains an assembler directive. Assembler directives let the assembler perform tasks other than assembling instructions, such as, in this case, inserting a string into the executable buffer.

```
let hello = ops.offset();
```
This utility function returns a value indicating the position of the current end of the machine code buffer. It can later be used to obtain a pointer to this position in the generated machine code.
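
The relationship between a recorded offset and a pointer can be illustrated without dynasm at all. The sketch below uses a plain `Vec<u8>` as a stand-in for the machine code buffer. `record_offset`, `pointer_at` and `byte_at_recorded_offset` are hypothetical helpers, not part of dynasmrt's API; they only mirror the idea behind `ops.offset()` and `ExecutableBuffer::ptr`.

```rust
/// Hypothetical stand-in for `ops.offset()`: the current end of the buffer.
fn record_offset(buffer: &[u8]) -> usize {
    buffer.len()
}

/// Hypothetical stand-in for `ExecutableBuffer::ptr`: turn a previously
/// recorded offset back into a pointer into the buffer.
fn pointer_at(buffer: &[u8], offset: usize) -> *const u8 {
    // Slicing bounds-checks the offset before a raw pointer is produced.
    buffer[offset..].as_ptr()
}

/// Emit some data, record the offset, emit one more byte, then read back
/// the byte found at the recorded position.
fn byte_at_recorded_offset() -> u8 {
    let mut buffer: Vec<u8> = Vec::new();
    buffer.extend_from_slice(b"Hello World!"); // data emitted first
    let entry = record_offset(&buffer); // position where the "code" starts
    buffer.extend_from_slice(&[0xC3]); // 0xC3 encodes `ret` on x86-64
    let ptr = pointer_at(&buffer, entry);
    // Safety: `ptr` points at a live element of `buffer`.
    unsafe { *ptr }
}
```

Because the offset is taken after the string data has been emitted, the pointer obtained from it skips the data and lands on the first byte that follows.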


```
dynasm!(ops
    ; lea rcx, [->hello]
    ; xor edx, edx
    ; mov dl, BYTE string.len() as _
    ; mov rax, QWORD print as _
    ; sub rsp, BYTE 0x28
    ; call rax
    ; add rsp, BYTE 0x28
    ; ret
);
```
The second invocation of the `dynasm!` macro contains the definition of a small function. It performs the following tasks:

```
; lea rcx, [->hello]
```
First, the address of the global label `->hello` is loaded using the load effective address instruction and a label memory reference.

```
; xor edx, edx
; mov dl, BYTE string.len() as _
```
Then the length of the string is loaded. Here the `BYTE` prefix determines the size of the immediate in the second instruction. The `as _` cast is necessary to coerce the `usize` length down to the `i8` type expected of an immediate. Dynasm-rs tries to avoid performing implicit casts as this tends to hide errors.
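
The behaviour of `as _` can be seen in plain Rust, independent of dynasm. In the sketch below, `imm8` is a hypothetical function standing in for a slot that only accepts an 8-bit immediate; the compiler infers the cast target from its parameter type.

```rust
/// Hypothetical stand-in for a slot that only accepts an 8-bit immediate.
fn imm8(value: i8) -> i8 {
    value
}

/// `text.len()` is a `usize`; `as _` lets the compiler infer `i8` from the
/// parameter type of `imm8`.
fn length_as_imm8(text: &str) -> i8 {
    imm8(text.len() as _)
}

/// The cast truncates silently when the value does not fit, which is why
/// it has to be written out explicitly rather than applied implicitly.
fn truncated(len: usize) -> i8 {
    imm8(len as _)
}
```

For `"Hello World!"` the length is 12 and fits comfortably, but a length of 300 would be truncated to 44, so the explicit cast marks a spot that deserves attention for longer strings.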

```
; mov rax, QWORD print as _
; sub rsp, BYTE 0x28
; call rax
; add rsp, BYTE 0x28
```
Here, a call is made from the dynamically assembled code to the Rust `print` function. Note the `QWORD` size prefix, which is necessary to select the appropriate form of the `mov` instruction: `dynasm!` expands at compile time and cannot inspect the value the immediate expression will have at runtime. As this example uses the `"win64"` calling convention, the stack pointer needs to be adjusted around the call as well: the `0x28` bytes cover the 32 bytes of shadow space the callee may use plus 8 bytes that restore 16-byte stack alignment. (Note: the `"win64"` calling convention is used because it is currently impossible to use the `"sysv64"` calling convention on all platforms.)
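
The value `0x28` follows from two requirements of the win64 convention: the caller reserves 32 bytes of shadow space, and the stack must be 16-byte aligned at the `call`. On entry to our function the return address has just been pushed, so `rsp` is 8 bytes off alignment. The arithmetic can be sketched as follows; `stack_adjustment` and the sample `rsp` value in the test are illustrative and not part of dynasm.

```rust
/// Bytes of shadow space a win64 caller reserves for the callee.
const SHADOW_SPACE: u64 = 32;
/// Required stack alignment at the point of a `call`.
const ALIGNMENT: u64 = 16;

/// Number of bytes to subtract from `rsp` so that the shadow space is
/// reserved and the resulting stack pointer is 16-byte aligned.
fn stack_adjustment(rsp_on_entry: u64) -> u64 {
    let after_shadow = rsp_on_entry - SHADOW_SPACE;
    // Whatever is left over after reserving the shadow space is padding.
    SHADOW_SPACE + after_shadow % ALIGNMENT
}
```

For any entry value of `rsp` that is 8 modulo 16, this yields 40, which is the `0x28` used in the `sub` and `add` instructions.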

```
; ret
```
Finally, the assembled function returns, passing the return value of the `print` function in `rax` back to the calling Rust code.

```
let buf = ops.finalize().unwrap();
```
With the assembly completed, we now finalize the `dynasmrt::Assembler`, which will resolve all labels previously used and move the data into a `dynasmrt::ExecutableBuffer`. This struct, which dereferences to a `&[u8]`, wraps a buffer of readable and executable memory.

```
let hello_fn: extern "win64" fn() -> bool = unsafe {
    mem::transmute(buf.ptr(hello))
};
```
We can now get a pointer to the executable memory using the `dynasmrt::ExecutableBuffer::ptr` method, using the value obtained earlier from `ops.offset()`. We can then transmute this pointer into a function.
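
The transmute step can be tried in isolation on stable Rust. The sketch below replaces the dynamically assembled code with an ordinary function, `always_true`, and uses the `"C"` calling convention instead of `"win64"` so that it builds on any platform; both are assumptions made for illustration. It also assumes the pointer arrives as an untyped `*const u8`.

```rust
use std::mem;

/// Ordinary function standing in for dynamically assembled code.
extern "C" fn always_true() -> bool {
    true
}

/// Erase the function's type, then recover a callable function pointer.
fn call_through_raw_pointer() -> bool {
    // Erase the type, leaving only an address.
    let raw: *const u8 = always_true as *const u8;
    // Safety: `raw` really is the address of an `extern "C" fn() -> bool`.
    let f: extern "C" fn() -> bool = unsafe { mem::transmute(raw) };
    f()
}
```

The `unsafe` block is required because the compiler cannot verify that the address holds code matching the declared signature and calling convention; getting either wrong is undefined behaviour.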

```
assert!(
    hello_fn()
);
```
And finally we can call this function, asserting that it returns true to ensure that it managed to print the encoded message!

# Advanced usage

Coming soon.
4 changes: 2 additions & 2 deletions plugin/src/compiler.rs
@@ -223,9 +223,9 @@ fn compile_op(ecx: &ExtCtxt, buffer: &mut StmtBuffer, op: Ident, prefixes: Vec<I
op_size = try!(get_operand_size(data, &args));

if data.flags.contains(AUTO_NO32) {
- if op_size == Size::QWORD {
+ if op_size == Size::WORD {
pref_size = true;
- } else if op_size != Size::WORD {
+ } else if op_size != Size::QWORD {
return Err(Some(format!("'{}': Does not support 32 bit operands in 64-bit mode", &*op.node.name.as_str())));
}
} else if data.flags.contains(AUTO_REXW) {
8 changes: 6 additions & 2 deletions plugin/src/x64data.rs
@@ -1,4 +1,4 @@
- use std::collections::HashMap;
+ use std::collections::{HashMap, hash_map};

use compiler::Opdata;

@@ -74,6 +74,10 @@ pub mod flags {
}
}

+ pub fn mnemnonics() -> hash_map::Keys<'static, &'static str, &'static [Opdata]> {
+ OPMAP.keys()
+ }

// workaround until bitflags can be used in const
const VEX_OP : u32 = flags::flag_bits(flags::VEX_OP);
const XOP_OP : u32 = flags::flag_bits(flags::XOP_OP);
@@ -151,7 +155,7 @@ Ops!(OPMAP;
b"v*ib", [0x0F, 0xBA ], 5, AUTO_SIZE | LOCK;
] "bzhi" = [ b"r*v*r*", [ 2, 0xF5 ], X, AUTO_REXW | VEX_OP;
] "call" = [ b"o*", [0xE8 ], X, AUTO_SIZE;
- b"r*", [0xFF ], 2, AUTO_SIZE;
+ b"r*", [0xFF ], 2, AUTO_NO32;
] "cbw" = [ b"", [0x98 ], X, WORD_SIZE;
] "cwde" = [ b"", [0x98 ], X;
] "cdqe" = [ b"", [0x98 ], X, WITH_REXW;
10 changes: 5 additions & 5 deletions runtime/src/lib.rs
@@ -14,18 +14,18 @@ use memmap::{Mmap, Protection};
/// this allows it to be used as an easy shorthand for passing pointers as dynasm immediate arguments.
#[macro_export]
macro_rules! Pointer {
- ($e:expr) => {&$e as *const _ as _};
+ ($e:expr) => {$e as *const _ as _};
}

/// Preforms the same action as the Pointer! macro, but casts to a *mut pointer.
#[macro_export]
macro_rules! MutPointer {
- ($e:expr) => {&mut $e as *mut _ as _};
+ ($e:expr) => {$e as *mut _ as _};
}

/// This trait represents the interface that must be implemented to allow
/// the dynasm preprocessor to assemble into a datastructure.
- pub trait DynAsmApi<'a> : Extend<u8> + Extend<&'a u8> {
+ pub trait DynasmApi<'a> : Extend<u8> + Extend<&'a u8> {
/// Report the current offset into the assembling target
fn offset(&self) -> usize;
/// Push a byte into the assembling target
@@ -146,7 +146,7 @@ impl<'a> Extend<&'a u8> for Assembler {
}
}

- impl<'a> DynAsmApi<'a> for Assembler {
+ impl<'a> DynasmApi<'a> for Assembler {
#[inline]
fn offset(&self) -> usize {
self.ops.len() + self.asmoffset
@@ -381,7 +381,7 @@ impl Executor {
/// A structure wrapping some executable memory. It dereferences into a &[u8] slice.
impl ExecutableBuffer {
/// Obtain a pointer into the executable memory from an offset into it.
- /// When an offset returned from DynAsmApi::offset is used, the resulting pointer
+ /// When an offset returned from DynasmApi::offset is used, the resulting pointer
/// will point to the start of the first instruction after the offset call,
/// which can then be jumped or called to divert control flow into the executable
/// buffer. Note that if this buffer is accessed through an Executor, these pointers
4 changes: 3 additions & 1 deletion testing/src/main.rs
@@ -3,7 +3,7 @@

#[macro_use]
extern crate dynasmrt;
- use dynasmrt::DynAsmApi;
+ use dynasmrt::DynasmApi;

macro_rules! test {
() => (mov rax, rbx)
@@ -126,7 +126,9 @@ fn main() {
bar: u32
}
let mut test_array = [Test {foo: 1, bar: 2}, Test {foo: 3, bar: 4}, Test {foo: 5, bar: 6}];
+ let mut test_array = &mut test_array;
let mut test_single = Test {foo: 7, bar: 8};
+ let mut test_single = &mut test_single;
dynasm!(ops
; mov rax, AWORD MutPointer!(test_array)
; mov ebx, 2