R API Notes

Multiplot

multiplot <- function(..., plotlist=NULL, file, cols=1, layout=NULL) {
  require(grid)

  # Make a list from the ... arguments and plotlist
  plots <- c(list(...), plotlist)
  numPlots <- length(plots)

  # If layout is NULL, then use 'cols' to determine layout
  if (is.null(layout)) {
    # Make the panel
    # ncol: Number of columns of plots
    # nrow: Number of rows needed, calculated from # of cols
    layout <- matrix(seq(1, cols * ceiling(numPlots/cols)),
                     ncol = cols, nrow = ceiling(numPlots/cols))
  }

  if (numPlots == 1) {
    print(plots[[1]])
  } else {
    # Set up the page
    grid.newpage()
    pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout))))

    # Make each plot, in the correct location
    for (i in 1:numPlots) {
      # Get the i,j matrix positions of the regions that contain this subplot
      matchidx <- as.data.frame(which(layout == i, arr.ind = TRUE))
      print(plots[[i]], vp = viewport(layout.pos.row = matchidx$row,
                                      layout.pos.col = matchidx$col))
    }
  }
}
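
The default layout arithmetic in multiplot can be checked outside R. The following is a minimal Python sketch, assuming only what the function shows: the number of rows is ceiling(numPlots/cols), and R fills the layout matrix column by column, so plot numbers run down each column first. The name default_layout is hypothetical and used only for illustration.

```python
import math


def default_layout(num_plots, cols=1):
    """Return (rows, {plot_number: (row, col)}) with 1-based positions.

    Mirrors the default branch of multiplot: rows = ceiling(num_plots / cols),
    and plot numbers fill the grid column by column, as R's matrix() does.
    """
    rows = math.ceil(num_plots / cols)
    positions = {}
    for plot in range(1, num_plots + 1):
        # Column-major fill: the quotient is the column, the remainder is the row.
        col, row = divmod(plot - 1, rows)
        positions[plot] = (row + 1, col + 1)
    return rows, positions


rows, positions = default_layout(5, cols=2)
for plot, (row, col) in sorted(positions.items()):
    print("plot", plot, "-> row", row, "col", col)
```

With five plots and two columns, the grid has three rows and the last cell of the second column stays empty.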

Density, Histogram & CDF

Colors and Styles

library(ggplot2)
library(reshape2)

f1 <- "O0-O3_ins_stat.txt"
f2 <- "O2-O3_ins_stat.txt"
d1 <- read.table(f1, header=FALSE)
d2 <- read.table(f2, header=FALSE)

# Truncate both samples to the same length so they fit in one data frame
rc <- min(nrow(d1), nrow(d2))
x <- data.frame(v1=d1[seq_len(rc),], v2=d2[seq_len(rc),])

# Long format: one row per observation, with columns 'variable' and 'value'
data <- melt(x)

m1 <- ggplot(data, aes(x=value, fill=variable)) + geom_density(alpha=0.25) + theme_bw() + xlim(0, 500)
m2 <- ggplot(data, aes(x=value, fill=variable)) + geom_histogram(alpha=0.25) + theme_bw() + xlim(0, 500)
# 'variable' is discrete on the x axis here, so limit the y axis instead
m3 <- ggplot(data, aes(x=variable, y=value)) + geom_boxplot() + theme_bw() + ylim(0, 500)
m4 <- ggplot(data, aes(x=value, color=variable)) + stat_ecdf(geom = "point", size=1, alpha=0.3) + theme_bw() + xlim(0, 500)
multiplot(m1, m2, m3, m4, cols = 4)

Two-sample Kolmogorov-Smirnov test

f1="fps.txt"
f2="fps_prio.txt"
read.table(f1, header=FALSE) -> d1
read.table(f2, header=FALSE) -> d2
print(summary(d1$V1))
print(sd(d1$V1))
print(summary(d2$V1))
print(sd(d2$V1))
print(ks.test(d1$V1, d2$V1, alternative = 'greater'))
print(ks.test(d1$V1, d2$V1, alternative = 'less'))
print(ks.test(d1$V1, d2$V1, alternative = 'two.sided'))
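
To make the three alternatives less opaque, here is a minimal Python sketch of the statistics behind a two-sample Kolmogorov-Smirnov test, computed directly from the empirical CDFs. It assumes the usual definitions: D+ is the largest amount by which the ECDF of the first sample lies above that of the second (the one-sided case that R's documentation describes for alternative = 'greater'), D- is the reverse, and D is the larger of the two. The function name ks_statistics is hypothetical, and the sketch computes no p-values.

```python
from bisect import bisect_right


def ks_statistics(x, y):
    """Return (d_plus, d_minus, d) for two samples x and y.

    d_plus  = max over t of F_x(t) - F_y(t)
    d_minus = max over t of F_y(t) - F_x(t)
    d       = max(d_plus, d_minus)
    """
    xs, ys = sorted(x), sorted(y)
    d_plus = d_minus = 0.0
    # Both ECDFs are step functions that only change at sample points,
    # so it is enough to compare them at every observed value.
    for t in xs + ys:
        fx = bisect_right(xs, t) / len(xs)  # fraction of x values <= t
        fy = bisect_right(ys, t) / len(ys)  # fraction of y values <= t
        d_plus = max(d_plus, fx - fy)
        d_minus = max(d_minus, fy - fx)
    return d_plus, d_minus, max(d_plus, d_minus)


print(ks_statistics([1, 2, 3, 4], [3, 4, 5, 6]))
```

A large D+ together with a D- near zero means the first sample tends to take smaller values than the second.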

Idaapi GISTS

Force creation of an assembly function:

import idaapi
import idc

# get address from unstripped file
def check():
    f = idaapi.get_func(idc.ScreenEA())
    print([hex(f.startEA), hex(f.endEA)])

# create function in stripped file
def create(ea_start, ea_end):
    # decode an instruction at each address in the range, then define the function
    for addr in range(ea_start, ea_end):
        idaapi.create_insn(addr)
    idc.MakeFunction(ea_start, ea_end)

Remove stack reference name in current function:

def unstack():
    f = idaapi.get_func(idc.ScreenEA())
    for addr in range(f.startEA, f.endEA):
        # try the first five operand slots at every address
        for i in range(5):
            idc.OpOff(addr, i, 16)

Setting up TensorFlow with Keras

Following up on the deep learning workshop presented at SIS (slides), this note explains how to set up TensorFlow and Keras on your own computer without affecting your existing Python installations.

Setup

  1. Follow the instructions on the Anaconda download site to download and install Anaconda.
  2. Create a separate Python environment by issuing the following command:

    conda create -n tensorflow python=3.5
  3. Activate this environment (this is the Windows form; on Linux or macOS, use "source activate tensorflow"):

    activate tensorflow
  4. You need to activate this environment every time you want to work with TensorFlow and Keras.

  5. Install TensorFlow by invoking:

    pip install tensorflow
  6. Install Keras by invoking:

    pip install keras
  7. If you encounter SciPy installation errors at this step (look for red text in the output), run the following command and then install Keras again:

    conda install -c anaconda scipy
  8. Now you can run your Python script:

    python xxx.py
  9. Or you can work in a Jupyter notebook:

    jupyter notebook
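
After the steps above, it can help to confirm that the activated environment is the one Python is actually using and that both packages are visible to it. The following is a minimal sketch using only the standard library; the function name missing_packages is hypothetical. It checks only that the packages can be located, not that they load and run correctly.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The interpreter path should point inside the 'tensorflow' environment.
    print("Python executable:", sys.executable)
    missing = missing_packages(["tensorflow", "keras"])
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("TensorFlow and Keras were both found.")
```

If the printed interpreter path does not point into the tensorflow environment, the environment is probably not activated in the current shell.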

Other resources:

Compile LibVex on Windows

To use LibVex in Kam1n0, we need to compile LibVex (from Valgrind) on Windows.
Dependencies:

  • mingw64 with the msys tools installed
  • mingw64/bin and msys/bin added to the PATH environment variable

Clone libvex source (from angr repo).

  • git clone git@github.com:angr/vex

We need to update Makefile-gcc. Specifically, we need to define CC and AR:

#ifndef CC
CC = gcc
#endif
#ifndef AR
AR = ar
#endif

We also need to redefine HWord in LibVex as long long int (64-bit), because long is only 32 bits wide on 64-bit Windows.
Then run make; the resulting libvex.a file appears in the vex-master directory.

Sklearn+XGBoost Cross Validation GridSearch Tuning

This note walks through an example that uses XGBoost with Sklearn to tune model parameters by cross-validated grid search. The example is based on our recent task of age regression on personal information management data. The code covers:

  • Reading data
  • Removing invalid records
  • Slicing data
  • Splitting training vectors and their corresponding labels
  • Imputing missing values
  • Scaling dataset
  • Feature selection
  • Grid search
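
The last step in the list, a grid search scored by cross-validation, can be sketched without any third-party library. The code below is not the Sklearn or XGBoost API. The helper names (k_fold_indices, grid_search_cv, fit_ridge_1d, neg_mse) and the toy one-dimensional ridge model are hypothetical, chosen only to show the control flow: try every parameter combination, score each one on k held-out folds, and keep the combination with the best mean score.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds; yield (train_idx, test_idx)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def grid_search_cv(fit, score, X, y, param_grid, k=3):
    """Evaluate every parameter combination; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for train, test in k_fold_indices(len(X), k):
            # Fit on the training folds only, then score on the held-out fold.
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            fold_scores.append(score(model, [X[i] for i in test], [y[i] for i in test]))
        avg = mean(fold_scores)
        if avg > best_score:
            best_params, best_score = params, avg
    return best_params, best_score


def fit_ridge_1d(X, y, alpha=0.0):
    """Toy model: slope of a 1-D ridge regression through the origin."""
    return sum(a * b for a, b in zip(X, y)) / (sum(a * a for a in X) + alpha)


def neg_mse(slope, X, y):
    """Negative mean squared error, so that a higher score is better."""
    return -mean((slope * a - b) ** 2 for a, b in zip(X, y))


X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0 * v for v in X]
best_params, best_score = grid_search_cv(
    fit_ridge_1d, neg_mse, X, y, {"alpha": [0.0, 1.0, 10.0]}, k=3)
print(best_params, best_score)
```

Because the toy data follow y = 2x exactly, the unregularized setting fits every fold perfectly and wins the search. On real data the preferred setting is usually less obvious, which is the reason for searching.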

We need the following dependencies:

%matplotlib inline 

import math
import numpy as np
import xgboost as xgb
import pandas as pd
import matplotlib.pyplot
from xgboost.sklearn import XGBClassifier
from sklearn import preprocessing
from sklearn.preprocessing import Imputer
from sklearn.feature_selection import VarianceThreshold
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import f_classif
from sklearn.grid_search import GridSearchCV
from sklearn import cross_validation