Got iGoogle?

One of my favourite products from Google is iGoogle, a personalized homepage much like My Yahoo and MS Live. I have been using it since the beginning of this year and have fallen in love with it. The Google Gadgets API is very cool: it allows users to create their own gadgets.

Here are some of the gadgets I have attached to mine; I highly recommend them.

“Quote of the day”
“Google Notebook”
“Places to See Before You Die”
“To-Do List”
“Weather”
“Gmail”
“Google Calendar”
“Google Reader”

For me, iGoogle helps organize my web things; it is very portable and adds lots of useful little tips to my everyday life.

Disk scheduling policies with lookahead

Disk scheduling policies with lookahead, A. Thomasian, C. Liu, ACM SIGMETRICS Vol. 30, No. 2, September 2002, pp. 31-40.

The disk scheduling methods we already know either ignore positioning time altogether, like FCFS, or minimize only seek time, like SSTF. However, on a modern disk it is preferable to minimize the sum of seek time and rotational latency, which is what the SATF policy does. The authors therefore introduce some new disk scheduling methods. They review the major disk scheduling methods such as FCFS, SSTF, CSCAN, CSCAN-Lai, SATF, HOL and SATF-RP. They describe the simulation model used to evaluate the relative performance of these methods and analyze the simulation results. Their main contribution is extending CSCAN and SATF with lookahead so that they can cope with the dynamic nature of arrivals to the system.
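To make the difference between SSTF and SATF concrete, here is a minimal sketch in Python. It is not the authors' simulator: the linear seek model, the timing constants and all function names are my own assumptions for illustration.

```python
def sstf_pick(head_cyl, requests):
    """SSTF: choose the request with the smallest seek distance."""
    return min(requests, key=lambda r: abs(r[0] - head_cyl))


def positioning_time(head_cyl, head_angle, request,
                     seek_ms_per_cyl=0.1, rotation_ms=10.0):
    """Seek time plus rotational latency, in milliseconds.

    Angles are fractions of a full rotation in [0, 1). The linear seek
    model and the default timings are assumptions for illustration.
    """
    cyl, angle = request
    seek = abs(cyl - head_cyl) * seek_ms_per_cyl
    # The platter keeps spinning while the arm moves.
    angle_after_seek = (head_angle + seek / rotation_ms) % 1.0
    latency = ((angle - angle_after_seek) % 1.0) * rotation_ms
    return seek + latency


def satf_pick(head_cyl, head_angle, requests):
    """SATF: choose the request with the smallest seek + latency."""
    return min(requests,
               key=lambda r: positioning_time(head_cyl, head_angle, r))


# Each request is (cylinder, angular position of the target sector).
requests = [(102, 0.0), (110, 0.5)]
print(sstf_pick(100, requests))       # nearest cylinder
print(satf_pick(100, 0.0, requests))  # shortest total positioning time
```

With these made-up numbers, SSTF picks cylinder 102 because it is closest, but the target sector has just passed under the head, so the disk waits almost a full rotation. SATF picks cylinder 110, whose longer seek is more than paid back by the shorter rotational wait.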

As we know, disk capacity is no longer the major concern it used to be, and seek times have become much faster. I believe each disk scheduling method suits some specific kind of data; it seems to me that no single method can be optimal for all the data stored on a disk. My question is whether there could be a disk scheduling method that works like the multilevel queue (MTLQ) we studied in an earlier chapter, where the right algorithm is selected and requests move up and down depending on their starvation level. That would be much more interesting.
In my opinion, read and write speed would improve more from a faster motor and other mechanical changes than from scheduling methods. Scheduling would of course bring some improvement, but only a minor one, since today we don’t feel that the bottleneck in transferring data is at the storage device.

For this paper, I would rate the significance as 3/5 (modest), because 20% of the paper reviews scheduling methods that most of us already know, and the simulation doesn’t show a significant improvement in disk utilization. The latter is the paper’s most noticeable deficiency.

Bigtable: A Distributed Storage System for Structured Data

To improve applicability, scalability, performance and availability of storage for large data sets, the authors implemented and deployed a distributed storage system called Bigtable; this is the main motivation of the paper. To manage large data, the system provides a simple data model that gives clients dynamic control over data layout and format, as described in the following paragraph.

As for their contributions, the authors spent roughly seven person-years on design and implementation. They introduced an interesting model built on a map data structure, with rows, column families (which form the basic unit of access control), and timestamps. The refinements and the performance evaluation described in the paper also show an improvement. Three real applications or products have succeeded by using the Bigtable implementation and concepts.
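The data model is easier to picture as code. Below is a toy, single-machine sketch in Python of a map indexed by row key, column key ("family:qualifier") and timestamp. The class and method names are hypothetical and are not Bigtable's API; distribution, persistence and sorted storage are all left out.

```python
class TinyTable:
    """Toy in-memory map: (row, column, timestamp) -> value."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = {}

    def put(self, row, column, timestamp, value):
        """Store one version of a cell."""
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before `timestamp`.

        With timestamp=None, return the newest version overall.
        Returns None when no matching version exists.
        """
        versions = self._cells.get((row, column), {})
        usable = [t for t in versions if timestamp is None or t <= timestamp]
        if not usable:
            return None
        return versions[max(usable)]

    def columns_in_family(self, row, family):
        """List the columns of one family that exist in a row."""
        prefix = family + ":"
        return sorted(col for (r, col) in self._cells
                      if r == row and col.startswith(prefix))


table = TinyTable()
table.put("com.cnn.www", "contents:", 5, "<html>v1")
table.put("com.cnn.www", "contents:", 6, "<html>v2")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")

print(table.get("com.cnn.www", "contents:"))      # newest version
print(table.get("com.cnn.www", "contents:", 5))   # version as of time 5
print(table.columns_in_family("com.cnn.www", "anchor"))
```

Keeping several timestamped versions per cell is what lets a client ask for either the latest value or the value as of some earlier time.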

The paper’s single most noticeable deficiency is already described by the authors in the paper itself. For example, the possibility of multiple copies of the same data is not considered; the user is left to say which data belongs in memory and which should stay on disk, rather than the system determining this dynamically; and lastly, there are no complex queries to execute or optimize. Bigtable seems to take data manipulation to a whole other level. However, my question concerns the network: it seems to me that latency plays an important role in retrieving or displaying query results. In my personal opinion, there is still a bottleneck, because it is a set of distributed servers that requires a high-performance network infrastructure to achieve the highest performance.

I would rate the significance of the paper 5/5 (breakthrough), because the Bigtable model is amazing: it can adapt to handle very large data, and it is used in many popular applications that we use nowadays, for example Google products such as Google Earth and Google Analytics. The concept of adding a new machine whenever more performance is needed for database operations is spectacular. I believe that Bigtable will be very useful in the future, and we will most likely see upcoming products from such companies adopt this model to improve their use of databases.

Reference:
Bigtable: A Distributed Storage System for Structured Data, F. Chang, J. Dean, S. Ghemawat, W. Hsieh, D. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. Gruber, Proc. of the 7th Conf. on USENIX Sym. on Operating Systems Design and Implementation, November 2006, pp. 205-218.

Serverless Network File Systems

Serverless Network File Systems, T. Anderson, M. Dahlin, J. Neefe, D. Patterson, D. Roselli, and R. Wang, Proc. of the 15th ACM Symposium on Operating Systems Principles, December 1995, pp. 109-126.

The authors believe that the traditional central-server network file system still has a bottleneck, in that every read miss and every write goes through the central server. It is also expensive, because it requires a person to operate the server and balance its load. Therefore, they introduce a serverless file system that distributes file system server responsibilities across large numbers of cooperating machines. Specifically, the authors have implemented a prototype serverless network file system called xFS to provide better performance and scalability than traditional file systems.

Three factors motivate their work on serverless network file systems: first, the opportunity provided by fast switched LANs; second, the expanding demands of users; and last, the fundamental limitations of central server systems. The authors make two sets of contributions. Firstly, xFS synthesizes a number of recent innovations that provide a basis for serverless file system design. Secondly, they have transformed DASH’s scalable cache consistency approach into a more general, distributed control system that is also fault tolerant. Moreover, they have improved on Zebra to eliminate bottlenecks.

The paper’s single most noticeable deficiency is the limitation of the measurements: the workloads are not real workloads but microbenchmarks, which offer more parallelism, and so better performance, than real workloads would. Another limitation of the measurements is that they compare against NFS, so the scalability comparison is limited.

This paper seems very solid and interesting to me. I like many of its ideas, for example, using cooperative caching to serve data from client memory. However, I still have a question about the future work and its limitations: what real workloads would the authors most likely measure, and what results would they expect to see on such workloads?

I would rate this paper 5/5 (breakthrough) for its challenging idea, the authors’ implementation, and their measurements. It improves on the old-fashioned central server in terms of performance, scalability, and availability. It could also help reduce the cost of hardware.

The Multics virtual memory: concepts and design

The Multics virtual memory: concepts and design, A. Bensoussan, C. T. Clingen and R. C. Daley, Communications of the ACM, Vol. 15, No. 5, May 1972, pp. 308-318.

As we know, the use of on-line operating systems has been growing, along with the need to share information among system users, and that sharing is done through segmentation. To take advantage of the direct addressability of large amounts of information made possible by large virtual memories, the authors were motivated to develop Multics (Multiplexed Information and Computing Service), which provides a generalized basis for the direct accessing and sharing of on-line information. There are two goals: first, it must be possible for all on-line information stored in the system to be addressed directly by a processor; second, it must be possible to control access to it.

Regarding the authors’ contributions, they introduce an idealized memory by using the segmentation and paging features of the 645, assisted by software features. Also, to take advantage of existing mechanisms, Multics processes and the Multics supervisor are introduced. The symbolic addressing conventions also provide ease of use: a user can reference a segment by giving only part of its pathname, and the rest of the pathname is supplied according to system conventions. Moreover, the mechanism for making a segment known to a process and the improved segment fault handler give Multics a lot of performance.

The paper’s single most noticeable deficiency is that it makes too many assumptions, which leaves readers confused about how to use the features of Multics. The conclusion should summarize what the authors have contributed and how it could be improved in future work, instead of presenting the user and supervisor viewpoints. It would also be good if the authors explained how the selection algorithm works. My question about the paper is how much it improves on the older approach.

Lastly, I would rate the significance of the paper 3/5 (modest), because the paper was published more than 30 years ago. It also lacks experimental results and any comparison with the use of segmentation.

A Dynamic Data Race Detector for Multithreaded Programs

Eraser: A Dynamic Data Race Detector for Multithreaded Programs, by Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, and Thomas Anderson, ACM Transactions on Computer Systems, Vol. 15, No. 4, November 1997, pp. 391-411.

According to the paper, Eraser: A Dynamic Data Race Detector for Multithreaded Programs, dynamic data races are hard to detect, so programmers suffer when programming with threads. There is already work that addresses the data race problem based on Lamport’s happens-before relation; however, it is costly, so the authors introduce a new method. These are the authors’ main motivations for the paper.

The authors contribute a dynamic race detection tool called “Eraser”. The tool monitors the program’s reads and writes as it executes; they state that it is more effective and less sensitive than manual debugging. Another main contribution of this paper is the Lockset algorithm, which is used to detect data races in multithreaded programs.
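The core of the Lockset idea fits in a few lines. For every shared variable, keep a set of candidate locks and intersect it with the locks the accessing thread holds; if the set becomes empty, no single lock protects the variable. The Python sketch below is only the basic version under my own assumptions (hypothetical class and method names, events fed in by hand). It leaves out the refinements for initialization and read-sharing, and it does not monitor a real program the way Eraser does.

```python
class LocksetChecker:
    """Basic lockset refinement over a hand-fed trace of events."""

    def __init__(self):
        self.held = {}        # thread -> set of locks currently held
        self.candidates = {}  # variable -> candidate lockset
        self.warnings = []    # variables with an empty candidate set

    def acquire(self, thread, lock):
        self.held.setdefault(thread, set()).add(lock)

    def release(self, thread, lock):
        self.held.get(thread, set()).discard(lock)

    def access(self, thread, var):
        """Record a read or write of `var` by `thread`."""
        held = self.held.get(thread, set())
        if var not in self.candidates:
            # First access: same result as intersecting "all locks"
            # with the locks held right now.
            self.candidates[var] = set(held)
        else:
            self.candidates[var] &= held
        if not self.candidates[var] and var not in self.warnings:
            self.warnings.append(var)


checker = LocksetChecker()

# Thread t1 touches x and y while holding mu1.
checker.acquire("t1", "mu1")
checker.access("t1", "x")
checker.access("t1", "y")
checker.release("t1", "mu1")

# Thread t2 touches x under a different lock, and y under mu1.
checker.acquire("t2", "mu2")
checker.access("t2", "x")
checker.release("t2", "mu2")
checker.acquire("t2", "mu1")
checker.access("t2", "y")
checker.release("t2", "mu1")

print(checker.warnings)
```

Here `x` is flagged because no common lock covers both accesses, while `y` is always accessed under `mu1` and stays quiet. The check never needs the two accesses to actually collide at run time.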

Moreover, Eraser can detect race conditions in an operating system kernel. For their experiments, the authors test Eraser on real programs and applications: the HTTP server and indexing engine from AltaVista, the Vesta cache server, the Petal distributed disk system, and various programs from students’ programming assignments. The authors are not much concerned about Eraser’s performance despite its high overhead; they believe it is fast enough to debug most programs, and they focus instead on the false alarms it raises when reporting data races.

The biggest deficiency of this paper is that Eraser cannot prove that the tested program is free of data races. Also, checking for dynamic data races is impractical. The experiments should cover most of the operating systems that we use these days, and various programming languages should be tested instead of only C++. Moreover, the use of Eraser should be described for the audience, so they can see how it works on each test program. Graphs and performance figures should be provided instead of only descriptions of what happened in the programs they ran.

I would rate the significance of this paper 4/5 (modest) for its challenging topic and idea.

Scalable Threads for Internet Services

Capriccio: Scalable Threads for Internet Services, R. von Behren, J. Condit, F. Zhou, G. C. Necula, E. Brewer, Proc. of the Nineteenth Symposium on Operating System Principles (SOSP-19), Lake George, New York, October 2003, pp. 268-281.

Thread-based versus event-based programming has been a popular topic recently. In this paper, the authors show a strong motivation and contribution: developing a scalable thread package, called Capriccio, for use with high-concurrency servers.

The authors note many disadvantages of event-based programming. For instance, “stack ripping”, where programmers have to save and restore live state by hand, is too complicated to use. The authors believe that a thread-based approach makes life easier and can achieve high concurrency just like event-based programming.
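A tiny example shows what stack ripping looks like in practice. The Python sketch below writes the same request handler twice: once in thread style, where local variables survive the blocking read, and once in event style, where the handler is split at the read and the live state has to be carried over by hand. The in-memory "disk", the toy event loop and all names are hypothetical and have nothing to do with Capriccio's code.

```python
from collections import deque

FILES = {"index.html": "<h1>hi</h1>"}   # hypothetical in-memory "disk"
pending = deque()                        # toy event queue


def handle_threaded(name):
    """Thread style: 'header' simply stays on the stack across the read."""
    header = "200 OK"
    body = FILES[name]                   # stands in for a blocking read
    return header + "\n" + body


def read_async(name, callback):
    """Toy non-blocking read: the callback runs later from the event loop."""
    pending.append(lambda: callback(FILES[name]))


def handle_event_start(name, on_done):
    """Event style, part 1: runs up to the point where the read is issued."""
    header = "200 OK"
    # This stack frame is about to disappear, so the live state must be
    # packed up by hand for the continuation ("stack ripping").
    saved = {"header": header, "on_done": on_done}
    read_async(name, lambda body: handle_event_finish(saved, body))


def handle_event_finish(saved, body):
    """Event style, part 2: restores the saved state and finishes."""
    saved["on_done"](saved["header"] + "\n" + body)


def run_loop():
    """Run queued callbacks until none are left."""
    while pending:
        pending.popleft()()


results = []
handle_event_start("index.html", results.append)
run_loop()
print(results[0] == handle_threaded("index.html"))
```

Both versions produce the same response, but the event version needs two functions and an explicit `saved` record for one blocking call. With several blocking calls in a row, that bookkeeping grows quickly.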

In order to make the thread-based model better than the event-based model, they built the thread package with user-level threads, because user-level threads have advantages in performance and flexibility over kernel threads. The implementation of Capriccio is amazing in that we don’t have to modify our applications to use the thread package’s features. Capriccio takes advantage of new mechanisms from recent Linux for its synchronization, I/O and scheduling. This is why the benchmark results shown in the paper are surprisingly good for thread creation, context switching and so on; it is faster than the original Linux threads and the other packages compared.

The ideas of linked stack management, resource-aware scheduling, the blocking graph, and the modifications to some algorithms surely improve system utilization. Judging by their evaluation, which compares against web servers such as Apache and Haboob, the results look realistic, because the benchmarks they use are real-world applications, and Capriccio performs very well in both scalability and scheduling.

However, we already know that the thread-based model must have some disadvantages. One that concerns me a lot is the case of multiple processors, with either homogeneous or heterogeneous chip types. The authors mention a drawback of user-level threading: it can make it more difficult to take advantage of multiple processors. As we know, SMP (symmetric multiprocessing) and CMP (chip multiprocessor) machines, like Intel’s dual-core chips, are increasingly common in the computer market these days. I wonder whether the thread-based model will take better advantage of multiple processors than the event-based model or not. What if we tried to mix user-level and kernel-level threads instead of employing only user-level threads? The future work section in the paper doesn’t give much detail on this issue.

Lastly, I would rate the significance of this paper 5/5 (breakthrough), because they have used and modified many mechanisms and created a new thread package to show us that thread-based programming is the better choice for high-concurrency internet servers. Their dedication and ideas are impressive.

World of Warcraft, a time drainer.

For the past few months, I have been really hooked on the most popular online game in the world. The game is called “World of Warcraft”, a polished game from Blizzard Entertainment, which is my favourite game development company. World of Warcraft is a very social and addictive online game. According to the news, there are currently 8 million subscribers playing this game.

So why am I telling you about this game? Because I have spent lots of my time killing monsters in it. Three years ago, when the game was first released, it was so boring for me. I couldn’t handle doing the same thing over and over again and couldn’t find any fun in it, so I went looking for something else to entertain my brain. How about having a robot play the game for me while I study? That sounded fun to me.

I was in the Wowbot, ISXwow and Wowglider scenes for a while. These programs play the game for you while you are away from the keyboard. They are third-party software and against the game’s terms of service. It is cheating! Some gamers take the use of this kind of software seriously, but I found it quite interesting; the developers are very talented and have already earned lots of money from their inventions. However, many players have been banned for using these programs, because they can be detected by Blizzard’s anti-cheat program, which is called “warden”.

For those who don’t know about warden, it is a small program that is integrated with the game and updates itself dynamically. It acts like spyware: it looks at the user’s processes in memory, or even reads the user’s hard drive, scanning for suspicious botting programs. If it finds that the user has such programs installed or running, that user’s game account is flagged and will most likely get hit with the ban stick.

As for the software design, at that time Wowglider was implemented differently from the others: instead of using code injection like Wowbot, Wowglider focused on manipulating the mouse and keyboard, so it was harder for warden to detect. I am not sure of the details of how the developers implemented those features, so I had better not say more. ISXwow is an extension for the Innerspace program, which acts as a layer between the OS and game applications, so the software has more flexibility and can support many DirectX games. I also contributed by writing a client-server application in Java; it is a small tool that hooks into the software so that I could remotely monitor and control my characters from my workplace 60 miles away.

Since warden was released, accounts found to be involved with third-party software have been banned from the game. The developers have been trying hard to avoid warden’s detection. Due to the growing number of banned users, many of them had to stop developing their programs and released their products as open source under a license.

There are many debates around the Internet about whether using these automation or “botting” programs is right. Some people think it is cheating, but others think it is just another way to enjoy their game. So even if they get banned, they will happily buy a new copy of the game and come back to bot again.

In the previous months, I began to play this game again (manually, at level 70); somehow I got hooked and couldn’t get away from my desk. I felt I needed rehab, and finally I decided to quit playing after the release of the 2.3 patch, with 1870 points for my 2v2 Arena team. For PvE, I was one of the founders and officers of my guild. I led my guild members from 10-man Karazhan to 25-man Serpentshrine.

Due to the limited time I have, I must say goodbye to this game. The game is great and it was fun while it lasted, but you know what? I feel that I shouldn’t have played it. It sucked too much time out of my life. My suggestion is: don’t ever think about playing it. Trust me, it is really addictive. I see many kids playing this game like a full-time job. It is sad.

Here are a clip and pic I left as a memorial of this game.


Why Events Are A Bad Idea

This is a review of the paper ‘Why Events Are A Bad Idea (for high-concurrency servers)’, R. von Behren, J. Condit and E. Brewer, Proceedings of HotOS IX, Kauai, Hawaii, May 2003, pp. 19-24. As we know, there has lately been a debate over whether thread or message-passing (event-based) programming performs best, and many people believe that event-based programming is better than thread programming in many ways. The authors’ main motivation is to show that thread programming is better than event-based programming for highly concurrent applications. They show that threads can perform about as well as events in many critical cases, and could do better with fixes to the compiler. Further, they conclude from their analysis of the simulation they built that threads will outperform event-based programming. In this review, I will explain the authors’ main contribution and the paper’s deficiency. Lastly, I will rate the significance of the paper based on my personal opinion.

According to the paper, events and threads differ in their responsibilities: events use event handlers and send or wait for messages, while threads use function calls, forks and so on. The authors also describe the problems with threads raised by those who think events do better, namely performance, control flow, synchronization, state management and scheduling. They argue that these problems are caused by how threads have been implemented rather than by threads themselves.

To convince us that threads can perform better than events, they point out two important properties that favor threads: in modern servers, the requests from clients are independent, and the code that handles each request is sequential. They therefore ran experiments that modify the compiler and integrate the compiler with the runtime system. Moreover, they ran the simulation and analyzed the results, finding that the event-based approach requires too many context switches and uses too much heap because its execution is so dynamic. Therefore, they conclude that threads avoid this kind of problem and can give better execution time.

In my opinion, the deficiency is that they haven’t done enough experiments on other cases. They could have tested on other operating systems, or used other benchmark suites with various inputs, before concluding that simple thread programming performs better than event-based programming. Thread versus message passing is an interesting topic, but in real-world applications it would cost a lot of time and effort to modify or integrate the compiler and runtime as they describe in the paper. Finally, suppose their future results show a big advantage for threads and a huge difference in performance between the two: if in reality many programmers still don’t quite understand how threads really work, are we going to make good use of the computer resources we have? I would rate the significance of this paper 3/5, because of the lack of evidence from real applications and the lack of references to other research that supports the authors’ arguments.