Jerry's Space: 10/01/2004

星期二, 10月 05, 2004

[Network] ns2 除錯

原文http://kunlung.csie.org/ns2_experience.html

☆用 gdb 來 debug

使用gdb (GNU debugger)可以對debug更有幫助，
例如難解的segmentation fault

不過ns-2.1b9目錄需要重新compile
編輯 Makefile.in
找到底下兩行

$(CPP) -c $(CFLAGS) $(INCLUDES) -o $@ $*.cc
$(CC) -c $(CFLAGS) $(INCLUDES) -o $@ $*.c

改成

$(CPP) -g3 -c $(CFLAGS) $(INCLUDES) -o $@ $*.cc
$(CC) -g3 -c $(CFLAGS) $(INCLUDES) -o $@ $*.c

執行 ./configure
./make
(因為compiler依據Makefile.in產生Makefile,
若改在Makefile則無效)
完成後，可以這樣使用(我的tcl放在 ~ns/examples )

gdb ../ns

(進入gdb畫面)

r kiwi.tcl

或是 r kiwi.tcl > log1 (r即為run)

通常程式中斷時，用backtrace 可以查呼叫關係
另外b(breakpoint)可以設中斷點

設中斷點的方法：

打 gdb ../ns 之後

進入gdb 命令提示列

打 b [路徑][檔案名]:[行號]

例如：

b mac/mac802_15_3.cc:1500
就是把中斷點設在mac802_15_3.cc的第1500行
最容易錯的地方在於路徑
gdb的current work directory 是執行檔的那個目錄
因為我們是執行ns
所以工作目錄應該是 ns-2.1b9
所以若我要debug mac目錄裡的檔案，前面就要加 "mac/"

設好中斷點後
r XXX.tcl 便開始執行
遇到中斷點就停下來
這時可以用n (代表next)一行一行的執行

配合中斷點

還可以加上條件式中斷

語法：

condition N EXPRESSION

N是中斷點的編號

EXPRESSION是一個敘述句

例如

condition 1 Scheduler::instance ().clock > 1.044412

就是說當時間超過1.044412且到達1號中斷點才中斷

gdb真的很強

EXPRESSION 怎麼寫都有效... :P

☆常見的 bug 訊息與原因

以下列出在make時常遇見的錯誤訊息的原因：
Scheduler: Event UID not valid!
這是scheduler物件的錯誤訊息
最常的可能是有一個timer物件，還沒有expire而你又呼叫它的start function
可能發生在你寫了新的timer物件，或是你更改了原作者的程式流程
中止在一點也沒有錯的程式區段
可能是你新增了新的data member在.h裡
因為交叉include而沒有被compile到
這時試試看 make clean; make
常常就OK了
Segmentation Fault
最常發生的原因是指標所指的物件不存在，或是traverse陣列時跑到陣列外面去了

[Academic] Zipf, Power-laws, and Pareto

Zipf, Power-laws, and Pareto是在Paper上常常見到。這三個名詞常用來描述一個現象"large events are rare, but small ones quite common"。例如，大的地震發生次數很少，小的地震發生次數確很多。大都市很少，小的鄉鎮很多。一些字很少見，但一些字則常常出現(如 'and' and 'the' ).

Reference: http://ginger.hpl.hp.com/shl/papers/ranking/ranking.html

Zipf's law 通常用來描述一個事件發生的數量或頻率( 'size')跟它的排名(rank) r有關. George Kingsley Zipf一位語言學家想找出某英文字出現的頻率有多高，例如 3rd 或 8th或100th常見字出見的次數。Zipf's law指出事件出現的頻率y跟它的排名(rank) r成反比:

y ~ r^-b, with b close to unity.

Pareto則是對大家的收入(income)分佈有興趣。相對於想知道第r個人的收入，Pareto想知道有多少人的收入大於x. Pareto's law使用累積分佈(cumulative distribution function (CDF))來找出分佈狀況。, i.e. the number of events larger than x is an inverse power of x:

P[X > x] ~ x^-k.

Pareto指出只有少數人是億萬富翁，大部分的人都是沒有錢的。

Power law distribution對有多少人的收入大於x不感興趣，相對的，它對有多少人收入洽好是x感興趣. It is simply the probability distribution function (PDF) associated with the CDF given by Pareto's Law. This means that

P[X = x] ~ x^-(k+1) = x^-a.

That is the exponent of the power law distribution a = 1+k (where k is the Pareto distribution shape parameter).

近年來，很多學者發現internet顯示出很多符合power-law distributions的現象: the number of visits to a site, the number of pages within a site, and the number of links to a page, to name a few.

圖1a展示了AOL users'訪問不同網站的圖形on a December day in 1997. 從圖中，可以發現只有很少的網站有超過2000位訪客, 大多數的網站只有很少的訪客 (70,000 網站只有一位訪客). The distribution is so extreme that if the full range was shown on the axes, the curve would be a perfect L shape. Figure 1b below shows the same plot, but on a log-log scale the same distribution shows itself to be linear. This is the characteristic signature of a power-law.

Fig. 1a Linear scale plot of the distribution of users among web sites

Fig. 1b Log-log scale plot of the distribution of users among web sites

Let y = number of sites that were visited by x users.

In a power-law we have y = C x-a which means that log(y) = log(C) - a log(x)

So a power-law with exponent a is seen as a straight line with slope -a on a log-log plot.

總結，Zipf, Power-laws, and Pareto都是在描述同一個現象，只是用不同的觀點罷了。