Friday, August 8, 2008

$1 Gesture Recogniser by Wobbrock, J. O. et al. - adaptation in C++

The original code can be found at http://depts.washington.edu/aimgroup/proj/dollar/

I made a C++ adaptation of the original C# version to use with OpenCV during my internship.
I've uploaded a copy here for academics to use in their C++ projects with as little code change as possible.

For simplicity, the code follows a singleton pattern.

http://www.comp.nus.edu.sg/~situyi/DollarRecogniser/



// Include headers at the top of your code
#include "dollarUtils.h"

#include "dollarRecogniser.h"

......

// Somewhere within the body of the code
Recogniser* gestureRecogniser = Recogniser::Instance();

vector<point> gesturePts;
// Populate gesture pts here
Result res = gestureRecogniser->Recognise(gesturePts);

string strGesture; // holds the name of the best-matching template

if(res.GetScore() <= 0.6) {
    strGesture = "0";
    cout << "Unrecognised gesture." << endl;
}
else {
    strGesture = res.GetName();
    cout << "Best gesture match is " << res.GetName() << " (" << res.GetScore() << ")" << endl;
}

gesturePts.clear();


Friday, April 18, 2008

New Lightdraw Video

Video: Lightdraw, Alpha 2


Wednesday, April 16, 2008

TIME.com Interview with G.R.L

Graffiti Meets the Digital Age - Interview with James Powderly and Evan Roth of Graffiti Research Lab

http://www.time.com/time/video/?bcpid=1214055407&bctid=1483830664

Monday, April 7, 2008

Week 14 - The Teapot from 1975, Utah.

0407

I've stumbled upon a few videos related to the OpenTouch project that I mentioned last Friday, 0404.

Introduction to MacLibre and OpenTouch
http://video.google.com/videoplay?docid=-3884258697819652860
Topics covered:
  • MacLibre
  • Multitouch introduction
  • OpenTouch concept
  • OpenTouch modules description
  • OpenTouch gestures

Compiling OpenTouch on Linux and Mac OSX
oscpack
  1. Copy /oscpack/MakefileLinux over /oscpack/Makefile
  2. Change "INCLUDES = -I./" into "INCLUDES = -I./ -I./ip/" !!!important!!!

Some related files posted on GSoC file repo
http://groups.google.com/group/google-summer-of-code-discuss/files


TUIO: A Protocol for Tangible User Interfaces
http://tuio.lfsaw.de/



0408

Today I experimented with Quartz Composer mainly because it has the potential for lots of eye candy and also because it is the easiest way for me to create a splashTop with widgets. I was a little reluctant at first, but this beats having to do the aesthetics in OpenGL where it could potentially eat into my development time.

So I did a little exploring and within minutes I was able to get a cloud to follow my mouse pointer. Then Kevin took this a step further and threw in a Utah teapot that rotates with respect to the cursor location. However, there was latency between the laser point and the mouse cursor, mainly because my Lightdraw application was calling the X11 API to move the pointer. This is slow since X11 is not the native windowing system on OS X Leopard, and also because Quartz Composer was polling the mouse pointer location on every loop.

I would think that the simplest way to resolve these issues is to send the laser coordinates from Lightdraw to Quartz Composer directly, via OpenSoundControl (OSC), much like what OpenTouch uses.
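
To sketch what that would look like with the oscpack library mentioned above (the OSC address "/lightdraw/laser" and the port below are placeholders I made up, not the final protocol), sending one coordinate pair per frame is only a few lines:

// Minimal sketch using oscpack: send one laser coordinate pair as an OSC message.
#include "osc/OscOutboundPacketStream.h"
#include "ip/UdpSocket.h"

void sendLaserPoint(float x, float y)
{
    char buffer[1024];
    osc::OutboundPacketStream packet(buffer, sizeof(buffer));
    packet << osc::BeginMessage("/lightdraw/laser") << x << y << osc::EndMessage;

    // Whatever receives OSC on the Quartz Composer side would listen on this port.
    UdpTransmitSocket socket(IpEndpointName("127.0.0.1", 7000));
    socket.Send(packet.Data(), packet.Size());
}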



0409

Finally hooked up Lightdraw and Quartz Composer to communicate via OSC protocols. I shall not have to endure choppy X11 on Mac anymore!

To do:
  1. Standardize usage of OSC protocol communication.
  2. Learn more about Quartz Composer. Find out if persistent variables are possible. (DONE, use Javascript)


0410 - Thanks Ben!

Ben from Apple Singapore came over to visit the Lightdraw crew. He kindly helped us by introducing a smoothing function into our Quartz Composer Utah teapot demo, which disguised the visual disruptions caused by the unsteady motion of a user's trembling hand. I suspect that I could implement the same kind of smoothing in Lightdraw to correct such visual disruptions too. After that, Kevin and I prepared a few demos for tomorrow's video recording and also for future presentations.
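
I do not know exactly what Ben's patch does internally, but a simple exponential moving average over the incoming laser coordinates would behave similarly and would be easy to drop into Lightdraw:

// Hypothetical smoothing for Lightdraw: an exponential moving average over the
// incoming laser coordinates. Smaller alpha = smoother but laggier pointer.
struct SmoothedPoint
{
    float x, y;
    bool  started;

    SmoothedPoint() : x(0), y(0), started(false) {}

    void update(float rawX, float rawY, float alpha = 0.3f)
    {
        if (!started) { x = rawX; y = rawY; started = true; return; }
        x = alpha * rawX + (1.0f - alpha) * x;
        y = alpha * rawY + (1.0f - alpha) * y;
    }
};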




0411

Recorded our first Lightdraw video with its entire length dedicated to lazer applications. That's right, it's lazer with a "z". The demos recorded involve:
  • Rotating an environment-mapped Utah teapot
  • Moving a rendered image of cloud
  • Navigating Mozilla Firefox
  • Interacting with Google Earth

These demos are more interesting now because, beyond the flashy graphics, they show how we can easily extend the laser interactions to applications with practical uses.

Here's the link to our previous video. Pardon the crappy music loop. I suggest we try using loops from fffff.at, with thanks and credit where credit is due, of course.


Outstanding tasks:
  • Multiuser splashtop
  • OSC multiuser protocol
  • $1 gesture recognition module
  • Profiling
  • Auto calibration

Wednesday, April 2, 2008

Week 13 - Back to my blogging routine

0331

Here I am, back to my blogging routine. I've just finished my interim report and handed it to Kevin for review. Meanwhile, I'm reading up on calibration references, primarily homography, which should correct keystone distortion. There are many other kinds of intrinsic and extrinsic calibration to be done programmatically (e.g. radial lens distortion, projector roll, etc.), but I do not think I'll have the time for these minor calibrations now.



0401

Spent the day revising my linear algebra.
I also came across an interesting gesture article called the "$1 Gesture Recogniser". I've tested the applet and it's fairly accurate for such a small piece of code. It would be great if I could add it to the Lightdraw package as a gesture recognition module.



0402

Finally I've implemented the calibration using homography matrices, cvSolve and cvFindHomography. At first I screwed up, but then I recalled that I have to normalize my matrices, since one of my assumptions is that I work with the z-component = 1.

To calibrate, all I need to do now is identify four predetermined non-collinear vertices by selecting their locations on the camera image. In the future, this method can be automated by projecting recognizable patterns for the camera to pick up.
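
For the record, applying the homography afterwards is just a matrix multiply followed by the normalization I forgot at first. A rough sketch with the OpenCV 1.x C API (the function name is mine):

// Map a laser point from camera coordinates to screen coordinates using the
// 3x3 homography H obtained from cvFindHomography (or cvSolve). Dividing by w
// is the normalization step that keeps the z-component equal to 1.
#include <cv.h>

CvPoint2D32f cameraToScreen(const CvMat* H, CvPoint2D32f cam)
{
    double x = cvmGet(H, 0, 0) * cam.x + cvmGet(H, 0, 1) * cam.y + cvmGet(H, 0, 2);
    double y = cvmGet(H, 1, 0) * cam.x + cvmGet(H, 1, 1) * cam.y + cvmGet(H, 1, 2);
    double w = cvmGet(H, 2, 0) * cam.x + cvmGet(H, 2, 1) * cam.y + cvmGet(H, 2, 2);
    return cvPoint2D32f(x / w, y / w);
}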

Now I've got really accurate laser to screen pixel mapping. Yeah!



0403

Following up on yesterday's work, I find that I no longer have to place the camera right where the projector is. The DVCam should be able to see an area bigger than the projected screen if I position the camera at a side, such that the camera observes a trapezoidal projection of the screen. However, the downside is that the laser point loses intensity when observed at an angle away from the screen perpendicular. This can be resolved by using a brighter laser, as tested with my 50 mW green laser.



0404

Very soon I've got to develop a multiuser-capable application, since I am not able to find one that is both multiuser and aesthetically pleasing and yet serves a practical purpose.

In summary, I would have to deliver, sometime in May:
  • A multiuser framework, such as OpenTouch (it is currently still under development). This is everything! I cannot use the current operating systems' single-user interface, so I'll have to design my own. The design worries me a bit as it is quite challenging.
  • A single application, probably like a splashtop, with many widgets inside. Each widget is controlled by a unique user, and multiple users can come together and control many widgets. These widgets will have to communicate with each other, of course.
  • An aesthetically pleasing application with nice transition effects.
  • It has to have a very responsive interface for it to be usable.
QtRuby may be the solution to my GUI design woes. It doesn't look pretty though. I'm not so sure about those transition effects, but the framework design (multiuser support) should be my priority.

Finally, I've submitted my interim report today.

Friday, March 7, 2008

Week 09 - Beginning of my Interim Report

0303

We had a Lightdraw presentation to SIGGRAPH Asia 2008 correspondents, and I was surprised by the appearance of my Computer Graphics professor, Dr. Huang Zhiyong. Turns out that he's a member of SIGGRAPH Singapore.

First meeting with open source committee today. We plan to have a custom IHPC OpenSource webpage for Lightdraw instead of the usual SourceForge-ish layout.

For the month of March I'll be spending time on my interim project report and blogging less.



0305

Went with Kevin to scout for a High Definition (HD) camcorder that can function as a webcam. Unfortunately, HD camcorders cannot be used as webcams; only non-HD camcorders have this feature.

Whackapeng can now be played with variable number of players and avatars.



0306

Translated laser tracking vertices to mouse coordinates. I am able to move the mouse pointer with my laser via X11 API calls.

Debugged a bug in Whackapeng which counted the number of encircled avatars incorrectly. It is caused by multiple threads accessing a common critical region. In this particular case, the bug plays out like this:

Thread A counts the number of encircled avatars, incrementing a global counter each time an avatar is encircled. It then checks this counter against the number of avatars to determine if the player has won.
Thread B is the countdown timer; it clears all avatars' encircled state.

Two out of three avatars are already encircled. Thread A examines an avatar that is encircled but already flagged.
Thread B interrupts and clears the flags of all avatars. And just before Thread B resets the global counter representing the number of encircled avatars...
... Thread A interrupts! It sees that the current avatar's flag is not set, and increments the global counter. This incorrectly triggers a winning condition.

The proposed solution is to use a semaphore to protect the critical region where the game is reset at the end of every countdown.
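
A rough sketch of that fix with a pthread mutex (a binary semaphore would do the same job; all the names below are made up, not the actual Whackapeng code):

// Both the counting pass and the countdown reset take the same lock, so the
// reset can never interleave with a partially finished count.
#include <pthread.h>

static pthread_mutex_t gameLock = PTHREAD_MUTEX_INITIALIZER;
static int encircledCount = 0;

void countEncircledAvatars(/* thread A */)
{
    pthread_mutex_lock(&gameLock);
    // ... walk the avatars, increment encircledCount, check the win condition ...
    pthread_mutex_unlock(&gameLock);
}

void resetRound(/* thread B, the countdown timer */)
{
    pthread_mutex_lock(&gameLock);
    // ... clear every avatar's encircled flag ...
    encircledCount = 0;
    pthread_mutex_unlock(&gameLock);
}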


Note to self: X11 path on mac is /usr/X11R6/include

Tuesday, February 26, 2008

Week 08 - "Leap Week"

0225

The demo program "moveresize" is able to work with the laser. Thresholding of the image affects the detection of laser blobs. A lower threshold makes the laser dot highly visible but introduces more false positives. With a higher threshold, the laser "drops" once in a while, so I've added a loss tolerance of two frames after my program fails to pick up the laser.
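
Roughly how that tolerance works (the names here are mine, not the actual moveresize code):

// Keep reporting the last known laser position until it has been missing for
// more than two consecutive frames; only then treat the laser as gone.
#include <cv.h>

struct LaserTracker
{
    CvPoint last;
    int     missedFrames;

    LaserTracker() : last(cvPoint(0, 0)), missedFrames(0) {}

    // Returns true while the laser is considered "on screen".
    bool update(bool detected, CvPoint detectedAt)
    {
        if (detected) { last = detectedAt; missedFrames = 0; return true; }
        return ++missedFrames <= 2;
    }
};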

Kevin simplified Lightdraw into three particular areas the team has to look into.
1. Full screen program, probably with transparency overlay. With graffiti, and gestures.
2. Window interaction.
3. Contextual interaction.



0226

Quickly put together an extremely simple gesture recognition scheme for moveresize. Though it is simple, it is extremely reliable, as it does corner detection using the Douglas-Peucker algorithm with a high rate of success.
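
OpenCV already ships Douglas-Peucker as cvApproxPoly; something along these lines is all the corner detection needs (the accuracy value is just a guess to be tuned):

// Sketch (OpenCV 1.x C API): the Douglas-Peucker step applied to a contour
// such as one returned by cvFindContours. cvApproxPoly collapses nearly
// collinear points, so what is left behind are the "corners" of the stroke.
#include <cv.h>

CvSeq* findCorners(CvSeq* contour, CvMemStorage* storage)
{
    return cvApproxPoly(contour, sizeof(CvContour), storage,
                        CV_POLY_APPROX_DP, 5.0 /* accuracy in pixels */, 0);
}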

Increased the effective area of the binarized pointer to allow a margin of human error when the user wants to initiate a pause (it is difficult to keep your hand still and focus the laser on a point).



0227

Went for an Apple talk in the morning about the role of Apple computing in science. We were introduced to free specialised software from Apple, such as tools for "stitching" displays together or for desktop sharing.



0228

Read up on Apple window manager programming. I think the X11 windowing system would be a good choice, as my SuSe development platform uses it natively and the Mac Pro has X11 as well.
I want to achieve transparent window overlays and control event signals using X11.



0229

Took the OpenCV motion template apart in order to assess how well it can detect (anti)clockwise motion. Even after removing the direction weights on motion history from two or more frames ago, I am not able to get a reliable reading from the motion template. Humans cannot draw a perfect circle, so the direction tangents do not change uniformly. This makes it difficult to determine when the circular motion starts or ends.

Friday, February 22, 2008

Dave Grossman comments on Blob Analysis Library

This post is meant as an archive, particularly because it is useful and easier to refer to.
Taken from unstable hosting at osdir: http://osdir.com/ml/lib.opencv/2005-11/msg00200.html?rfp=dta

Subject: RE: Re: help for CVBLOBSLIB! (msg#00200)
By now I imagine that rickypetit understands the truth of the old adage that "no
good deed goes unpunished!" He deserves a lot of credit for converting my blob
analysis code into cvblobslib. The result has been many more users, and
therefore many more questions. So, I will try to help him out here by providing
some answers. However, these answers all relate to my original code, which was
deprecated about 2 years ago when cvblobslib was created. And I've never used
cvblobslib. So, my answers are probably somewhat obsolete...
I don't have any additional documentation on the blob analysis algorithm. I
first saw this algorithm about 33 years ago in a presentation by Gerry Agin at
SRI International (then called Stanford Research Institute). I have been unable
to find it in any journal articles, so I implemented it by memory. The basic
idea is to raster scan, numbering any new regions that are encountered, but also
merging old regions when they prove to be connected on a lower row. The
computations for area, moments, and bounding box are very straightforward. The
computation of perimeter is very complicated. The way I implemented it was in
two passes. The first pass converts to run code. The second pass processes the
run codes for two consecutive rows. It looks at the starting and ending columns
of a region in each row. It also looks at their colors (B or W), whether or not
they were previously encountered, and whether a region is new or old, or a
bridge between two old regions. There are lots of possible states, but I deal
explicitly with the only states that are important.
The x centroid Xbar is determined by adding the contribution for each row. In
each row, if the pixels of a region are at y1, y2, ..., yn, then just add up all
these values y1+y2+...+yn. (Or equivalently, n[y1+yn]/2.) After finishing all
the rows, you have to divide the accumulated value by the area A.
The y centroid Ybar is determined as follows: For each row x, let the length of
the run be n. Then just add x*n for all rows. When finished, divide by the area
A.
Perimeter is really complicated for 2 reasons:
(1) A region may contain an interior region, so there is an interior perimeter
as well as an exterior perimeter. Since people usually want perimeter to be only
the exterior perimeter, the perimeter of the interior region has to be
subtracted at the end of the computation.
(2) When analyzing a particular row, you can only compute the perimeter
contribution of the prior row. (This is different from area and centroid, where
the computation is for the current row.) I found the perimeter computation very
complicated, and I no longer remember how it works. The moment computation
accumulates X^2, XY, and Y^2. But you really want (X-Xbar)^2, (X-Xbar)*(Y-Ybar),
and (Y-Ybar)^2. So there is an adjustment at the end of the computation once the
centroid values Xbar and Ybar are known.
// Case 1 2 3 4 5 6 7 8
// LastRow |xxx |xxxxoo |xxxxxxx|xxxxxxx|ooxxxxx|ooxxx |ooxxxxx| xxx|
// ThisRow | yyy| yyy| yyyy | yyyyy|yyyyyyy|yyyyyyy|yyyy |yyyy |
// Here o is optional
At each stage, the algorithm is scanning through two consecutive rows, termed
LastRow and ThisRow. During this scan, it sees a region in each row.
In Cases 1-4, the region in LastRow starts 2 or more columns before the region
in ThisRow.
In Case 1, the region in LastRow ends one or more columns before the region in
ThisRow starts. Therefore, these regions are NOT connected. In Case 2, the
region in LastRow starts before the region in ThisRow, but the LastRow region
continues at least to the column just before the region in ThisRow starts. Or it
continues further, but it ends before the region in ThisRow ends. Therefore, if
these regions have the same color, then they are connected.
(Note that I am using 6/2 connectivity, which means that at a corner like this
01
10
the 0's are connected but the 1s are not connected. In other words, the
connectivity of a pixel is given by this mask:
110
1X1
011
i.e., pixel X is connected to the 6 pixels with a 1 but is not connected to the
2 pixels with a 0. This introduces a slight bias in favor of regions that slope
to the left, but I consider it preferable to using either 4-connectivity or
8-connectivity.
*** I believe that the rickypetit cvblobslib generalizes this to allow the user
to choose the form of connectivity.)
In Cases 3 and 4, the region in LastRow starts before the region in ThisRow.
In Case 3, the region in LastRow continues beyond the region in ThisRow; in Case
4 they end at the same column. In Cases 5, 6, 7, the region in LastRow starts at
or after the column of the region in ThisRow. The distinction among these three
cases is which region ends first.
In all of Cases 3 - 7, if the colors match then the regions are connected.
Finally, Case 8 has the region in ThisRow end one column before the region in
LastRow.
These are ALL the possible cases. Depending on the case, the algorithm then
advances to the next region either in LastRow, or ThisRow, or both. When it
exhausts both rows, it then increments the row by one, so LastRow <- ThisRow,
and ThisRow is the next row encountered.

Thursday, February 21, 2008

Week 07 - Blob blog.

0218

I spent the day revising my article for IHPC's newsletter. In my opinion, this is a timely checkpoint for a small self-assessment as well as a short break from development. The article is a brief overview of what Lightdraw is about and why it deserves attention (or not). Writing helps me see how much I've progressed thus far, and serves as a good gauge of how much of my work might actually be important enough to share. Furthermore, having to pen down the methodology in ink means having to explain the rationale of my code (how and when the code should be used). Without explanation, code is just a bunch of logic and by itself is not very useful at all.



0219

Finally managed to finish the long overdue Lightdraw video with Peng and JL today.



0220

Favoured Dave Grossman's blob detection over my existing code, which used contour detection. The blob detection is much more efficient since it picks up blobs in a single pass. As a result of the tremendous improvement from the blob detection algorithm, real-time laser tracking is now possible.



0221

Studied my code closely, and with some trial and error I found a few pitfalls which reduced the frame rate.
- OpenCV's canny detection reduces the frame rate by ~2 fps.
- The frame rate is halved after I resize the OpenCV window.
- Morphology operations (i.e. dilate, erode) reduce the frame rate too. The larger the structuring element used, the greater the drop in fps.



0222

Since I'm so intrigued by the efficiency of blob detection over contour detection, I did some reading on their main differences. Here is a discussion on the performance difference between contour detection and blob analysis, and when to use which:
http://tech.groups.yahoo.com/group/OpenCV/messages/38670?threaded=1&m=e&var=1&tidx=1

In a nutshell, BlobsLib is faster for more than 5 blobs; contour detection is faster for fewer blobs.
My suggestion is to take the advice with a pinch of salt and do your own benchmarking to see which works best for you.

Saturday, February 16, 2008

Week 06 - Integration with Laser Detection

0211

For simplicity's sake, development for the past few weeks was done in an ideal lighting environment. This week my objective is to move my setup to the back projection system.

First thing I needed was to get the camera to see the laser points at the very least. I felt that this is crucial, because even a perfect laser detection program would not be able to function properly if the video input does not reflect the laser point half the time. The best a program can do is to cope with lossy input and perhaps extrapolate a set of points when there are gaps in a stroke. Still, this requires reliable input from the camera at the frontline in order to provide for small loss tolerance levels.

After trying several configurations again, my conclusion is that placing the camera at the back with the projectors is ideal. Somehow our previous experiment (ref: post 0118) went wrong. Also, setting the exposure to its lowest on the DVcam allows me to see only the laser dot, which is excellent really, but the bandwidth between the cam and the computer is slow at only ~15 fps. [0218, Edit: I found a setting that enables me to jack up the shutter speed using the sports mode, under the menu "Program AE". Joy!] An ideal camera would probably be one with manual exposure, manual focus and manual shutter speed settings that achieves >= 30 fps easily.



0212

My school's academic supervisor Dr Teo came down for a visit today to learn what Lightdraw is about.

Demonstration went relatively well, except when I had to switch from the front camera to the one at the back. Mental note: For demonstration use only lightdraw account.

So far I am able to give the coordinates of the laser point on an image, detect its colour and trace its motion. Things get trickier when I have two different-coloured lasers crossing paths; for now, I am not able to tell which path belongs to which.



0213

Learned how to install drivers on Mac computers, and got my USB camera to work with Light, the Mac, but only on the lightdraw account. Weird, huh.

Debugging ensues...



0214

Today was spent cleaning up my code and trying to make it more efficient. Each time an image processing function is called, it makes a few passes (scanning every pixel of the image). Furthermore, every loop calls a few of these functions, so the program is not able to keep up with the frame rate.

I found some articles and samples on blob detection, mainly for comparison, and while these algorithms highlight all kinds of blobs in general, some are really fast and efficient. Since I want to highlight laser points specifically, I could use some of the readily available blob detection methods and combine them with my own filters; then hopefully I would have something that is both efficient and reliable. Here's an interesting article I found on the comparison between blob analysis and edge detection algorithms in practice: http://archive.evaluationengineering.com/archive/articles/0806/0806blob_analysis.asp
Note: I found the article's perspective to be biased towards the edge detection algorithm.

OpenCV provides for blob detection, but the OpenCV community attests to its poor reliability.
Currently I am using another slightly different version (just a difference in the structuring elements used) of blob detection that works fine for me, as I clean up the image before feeding it to the blob analysis algorithm. The version of blob analysis that I am using is Dave Grossman's; he claims that he implemented G. J. Agin's algorithm from memory after attending the latter's presentation, because Agin's paper is difficult to find, being at least 35 years old! (I even tried looking in ACM and IEEE.) Dave Grossman explains the algorithm briefly in his post here:
http://osdir.com/ml/lib.opencv/2005-11/msg00200.html

More details on the algorithm here:
http://opencvlibrary.sourceforge.net/cvBlobsLib



0215

Got part of moveresize to work with the laser on the back projection screen. I can drag the box around now if I move my pointer around slowly. Looks like there's more optimizing to be done.


Tuesday, February 5, 2008

Week 05 - Happy Lunar New Year!

0204

I had to break down moveresize.c (a Lightdraw demo, not written by me) into manageable chunks for a few purposes:
  • It's easier for me to document along the way.
  • Refactoring will be much easier after documentation.
  • It will be obvious which portion of code goes into the Lightdraw libraries eventually.
Tuesday will be Peng and JL's attachment presentation, and will also be their last day of work in the Cove.



0205

Peng and JL's presentation went pretty well. Luckily, Peng spotted a question the night before and asked me about the exceptional common-vertex cases for the in-out algorithm (whether a point is inside or outside a polygon), for which I just gave the typical solution from the scan-line algorithm that I learned in Computer Graphics last semester.
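
For completeness, this is the even-odd test I had in mind; the half-open comparison is the usual trick that handles the shared-vertex cases (a ray passing exactly through a common vertex is counted once, not twice):

// Even-odd (scan-line) point-in-polygon test.
#include <vector>

struct Pt { double x, y; };

bool insidePolygon(const std::vector<Pt>& poly, Pt p)
{
    bool inside = false;
    int n = static_cast<int>(poly.size());
    for (int i = 0, j = n - 1; i < n; j = i++) {
        // An edge counts only if one endpoint is strictly above p and the
        // other is not: this half-open rule resolves the common-vertex case.
        if ((poly[i].y > p.y) != (poly[j].y > p.y)) {
            double xCross = poly[j].x + (p.y - poly[j].y) *
                            (poly[i].x - poly[j].x) / (poly[i].y - poly[j].y);
            if (p.x < xCross)
                inside = !inside;
        }
    }
    return inside;
}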

Dr. Su Yi gave a few pointers for further development during the discussion on contour processing. The contours could be converted into point sequences for plotting points as the user draws. The contour detection could then be based on a well-defined polygon instead of the current polygon approximation. This has to do with geometry processing, something which I will have to look into.

Wonderful farewell lunch for Peng and JL at Au Da Paolo Petite Salut, a French restaurant located near Holland Village. The food was excellent. Best chicken leg I've had in a long time; must go back there again! Did not recognise any of the wines they had there. The cost of wine there is rather steep at $16 a glass.



0206

Half-Day. Laser tracking with screen coordinates on the way.


0207 - 0208
Happy Lunar New Year!

Tuesday, January 29, 2008

Week 04 - Triggering Events

0128

Went to a local precision optics company to sample the ~532nm bandpass and ~650nm longpass filters. The red longpass worked really well, and the green laser point was not visible at all despite its high intensity. The laser dots appear as the brightest pixels when the webcam is fitted with the respective filters.

Back at the Cove, we resumed our UML discussion and added methods to class diagram, mainly to the Lightdraw controller and DisplayManager singletons.



0129

Implemented a feature to detect when the user has the laser point hovering over an area (this triggers a hold event). It is done using a separate motion history buffer with a shorter timeout duration.

With this, a user can control when he wants to draw graffiti by triggering the hold event to begin drawing. Drawing ceases when the laser point goes off screen, or when the laser is switched off.

Note: Further work can be done to make the detection more accurate. Take the region of pixels the laser point is sitting on, which will have a timestamp equal to 0 or some value. Search for the set of immediately connected pixels with a timestamp equal to this value, and empty the complement of this set (i.e. set the timestamp of the rest of the canvas to 0). This will guarantee that only one laser point can trigger the hold event.
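
A rough, untested sketch of that note with the OpenCV 1.x C API (the function name and details are mine): flood fill the motion history buffer from the laser's pixel into a mask, then clear everything outside that connected region.

// Keep only the connected region of the motion-history image that contains the
// laser point, so a second laser elsewhere cannot contribute to the hold event.
#include <cv.h>

void isolateHoldRegion(IplImage* mhi, CvPoint laser)
{
    // The mask must be 2 pixels wider and taller than the image for cvFloodFill.
    IplImage* mask = cvCreateImage(cvSize(mhi->width + 2, mhi->height + 2),
                                   IPL_DEPTH_8U, 1);
    cvZero(mask);

    // Mark the region whose timestamps match the seed pixel, writing 255 into
    // the mask only (the mhi itself is not modified by this call).
    cvFloodFill(mhi, laser, cvScalarAll(0), cvScalarAll(0), cvScalarAll(0),
                NULL, 4 | CV_FLOODFILL_MASK_ONLY | (255 << 8), mask);

    // Zero everything outside the region: copy the region out, clear the
    // buffer, then copy the region back in through the (cropped) mask.
    IplImage* keep = cvCloneImage(mhi);
    cvSetImageROI(mask, cvRect(1, 1, mhi->width, mhi->height));
    cvZero(mhi);
    cvCopy(keep, mhi, mask);

    cvResetImageROI(mask);
    cvReleaseImage(&keep);
    cvReleaseImage(&mask);
}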



0130

Because the UML diagrams have grown so much, Kevin suggested we put the diagrams on the projection screen, which seemed like a much better idea. I had a go with a Mac UML diagramming application called OmniGraffle. I felt that it was easy to use and had all the basic functionality that I needed. However, the feature that allows a user to tag notes is quite inflexible: what I wanted was to tag individual methods in a class, but it only allowed me to tag the entire class due to some restrictions.

Rebuilt one of the USB infrared pens into a battery-operated one instead. The end result works but does not look pretty at all.



0131

Today's UML discussion was mainly on how to handle events. Kevin suggested a signal-slot mechanism, and I got really confused while trying to understand what "slots" were.

So I did some reading up, and these are two of the more informative links I came across.

http://doc.trolltech.com/3.3/signalsandslots.html

http://sigslot.sourceforge.net/sigslot.pdf

The paper has a simple and effective example (using lights and switches) to show how two tightly coupled classes can inherit from the signal-slot classes to become loosely coupled, yet maintain type safety between them. Neat!
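
A toy version of that light/switch example, assuming the sigslot library from the second link (sigslot.h); the Switch never needs to know the Light's type beyond the slot signature:

// The two classes are only coupled through the connection made in main(),
// yet the compiler still checks that the signal and slot signatures match.
#include "sigslot.h"
#include <iostream>

class Switch
{
public:
    sigslot::signal0<> Clicked;        // a signal carrying no arguments
    void Flip() { Clicked(); }         // emit the signal
};

class Light : public sigslot::has_slots<>
{
public:
    Light() : on(false) {}
    void Toggle()                      // a slot: just an ordinary member function
    {
        on = !on;
        std::cout << "Light is now " << (on ? "on" : "off") << std::endl;
    }
private:
    bool on;
};

int main()
{
    Switch s;
    Light  l;
    s.Clicked.connect(&l, &Light::Toggle);  // wire them up at runtime
    s.Flip();                               // prints "Light is now on"
    return 0;
}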



0201

Contacted my academic supervisor today to set up a meeting on Tuesday the week after next.

Merging of the programs moveresize and laserMotion is still in progress.


Sunday, January 27, 2008

Week 03 - Design, design, design!

0121

The Lightdraw team held a meeting this morning as planned, and will continue to do so every morning this week.

For once, I found myself in a place where design and planning are not brushed away as if they were an unnecessary and time-wasting process. As this blog is not a platform for me to talk about the development horrors or the designs-that-will-inevitably-collapse-unto-themselves that I've encountered in past working experiences, I would just like to say that I am so darn glad that things are different here. Hope, is that you?

With help from everyone, the use case diagram was done without too much difficulty. I felt that it was a good reality check for two reasons. Firstly, it was a good chance for the team to clarify doubts, so everyone can be sure to have a common understanding of Lightdraw. Secondly, this forces me to revise my software engineering concepts.

Originally we had planned to use the afternoon to go to Sim Lim to get our supplies. However, the trip got postponed to Tuesday. I carried on with my debugging and managed to get my program to clear the canvas properly. I did so by splitting update_mhi into two parts: one for marking out motion, the other for indexing each pixel of movement with a timestamp in a separate image buffer.

I contacted a local company that deals with precision optics, and they claimed to have the optic filters that I am looking for. However, I wasn't able to get a quotation from the lady since the sales department was busy.



0122

Did domain modeling in the morning. It's like a simplified class diagram, which makes it easier to separate the different aspects of our program and group components with similar functionality together.

We went to Sim Lim after lunch. I was able to find the stuff needed for making an IR Wii pen. I bought:
  • 4x rechargeable AAA batteries
  • 2x IR LED (2-3V)
  • 2x 39 Ohm resistor (0.25 Watts)
  • 2x USB connectors
  • Crocodile clips
The camera shops did not sell the optical filters that we were looking for.



0123

Spent the morning assembling the two IR pens. How do I know they work? Webcams can see infrared light. Hand phone cameras work too. With my webcam I was able to tell that the infrared LEDs can be switched on.

Overloaded the minus '-' operator in imageWrapper.h, so that I can subtract the pixel values of one image from another. i.e. for two image buffers A and B:

dst[x,y] = 0, if A[x,y] - B[x,y] < 0
dst[x,y] = A[x,y] - B[x,y], otherwise


I needed this to kill the doppelgänger effect in motion detection (it renders the motion twice, with a delay). This is caused by the way update_mhi (originally) behaves. A motion is defined as a change in light, and light is given a numerical representation using image buffers. update_mhi finds the absolute difference between two frames, so it defines motion as the pixel difference between the old and the new image. Hence it draws the moving object twice: first at the object's previous location (found in the previous frame), and second at the object's new position.
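
I won't reproduce imageWrapper.h here, but the overload boils down to something like this (assuming the wrapper holds an 8-bit IplImage*; cvSub already clamps negative results to zero for unsigned images, which is exactly the rule above):

// Hypothetical stand-in for the real wrapper class, sketching the overload.
#include <cv.h>

class ImageWrapper
{
public:
    IplImage* img;

    ImageWrapper operator-(const ImageWrapper& other) const
    {
        ImageWrapper dst;
        dst.img = cvCloneImage(img);
        cvSub(img, other.img, dst.img);   // per-pixel A - B, clamped at 0
        return dst;
    }
};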

By Wednesday, we had a design pattern that was beginning to look like Model-View-Controller (MVC). We had also agreed on a nifty menu that we can use to interface with applications. Kevin found two more use cases we had missed. Given time, all design diagrams will grow into big ugly monsters, and I've got two (unrelated ones) right here:




Flowchart of my school's software engineering project. Chart is so big it had to be cropped into two pictures. I still find these diagrams hilarious.



0124

We've got a good-looking but incomplete class diagram today. CT was able to write a pseudo graffiti program with ease. We noticed that the developer writing the program has to pass a function pointer, though. There's still multithreading that we've yet to take into account.

Added a window that is a cross between canny and motion detection, so I get a visual feedback of where my pointer is when I'm drawing an image.

Got the Wii setup working; it works well when the wiimote is 35 degrees from the LCD plane. Calibration was not as easy as it seemed, mainly because the wiimote was not able to see the infrared LED at certain angles.



0125

Experimented with a kind of grey scale and pattern printed on a transparency as a webcam filter, as I thought it might help detect the laser light on a bright background. No dice; I should use a tinted filter to reduce the intensity instead.

Colour detection is much more successful with my new webcam, a Z-Star Micro zc0301p.

Next would be flood fill, pause detection, calibration and some kind of demo.

We're going to visit the precision optics company next Monday.

Saturday, January 19, 2008

Week 02 - Laser Detection I

0114

The week began with a question: How do I differentiate the laser dot from its background? The bright green dot could easily be a bullet on a presentation slide, or the desktop background, or the mouse pointer, or ....

I would like to think that I am able to differentiate a laser point from the background via:
  1. Color separation
  2. Motion tracking
  3. Canny edge detection
I decided to use OpenCV libraries to implement these methods, with the aim of finding the capabilities and limitations of what I can do for each of the three. Hopefully, one method would be able to make up for the shortcomings of the others. Then, a combination of methods will give me a better approximate for the location of the laser point.

I began working with the HSV colour model. I know that I will be looking for a green hue of high saturation and brightness. The green hue would be in an extremely small range at approximately 180 degrees. I tried to use OpenCV to convert the image from the RGB to the HSV colour model, before extracting the individual H, S and V channels from the resulting image. Getting the full channels is no problem at all, but I am only interested in a certain hue. Unfortunately for me, OpenCV uses a very different scale for HSV.

Usually, H spans from 0 to 360 degrees, and S and V take on values from either 0 to 1, or 0 to 100. OpenCV does not state explicitly what range of values HSV takes on, but a quick look at the sample code (camshiftdemo) reveals that in OpenCV, H spans from 0 to 180 (i.e. the hue in degrees divided by two), and the values of S and V span from 0 to 255. Confusion! I would have to use trial and error to understand the scale used in OpenCV.
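
What I was trying to write, roughly, with the OpenCV C API. The saturation and brightness cut-offs below are placeholders, and since OpenCV halves the hue, a hue of about 180 degrees becomes H of about 90 in the 8-bit image:

// Convert a BGR frame to HSV and keep only pixels inside a narrow hue band
// with high saturation and brightness. All thresholds are rough guesses.
#include <cv.h>

void extractHueBand(const IplImage* frameBGR, IplImage* mask /* 8U, 1 channel */,
                    int hMin, int hMax)
{
    IplImage* hsv = cvCreateImage(cvGetSize(frameBGR), IPL_DEPTH_8U, 3);
    cvCvtColor(frameBGR, hsv, CV_BGR2HSV);

    cvInRangeS(hsv,
               cvScalar(hMin, 100, 200, 0),    // lower bound: hue, sat, val
               cvScalar(hMax, 255, 255, 0),    // upper bound
               mask);

    cvReleaseImage(&hsv);
}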

Took a little time off after work to help debug Peng's program and caught two nasty bugs.



0115

By Tuesday, I was still unable to extract the laser hue from the webcam feed. I found out that my webcam sees the red and green laser dots as white instead of their respective colours, mainly because of their intensity. I am able to get the webcam to pick up their colours only on two occasions: when the laser dot is in motion, and when the camera is ridiculously out of focus. A comparison between my webcam and the iSight shows that the Apple iSight is able to pick up the colour of the laser dots better than my Logitech Quickcam IM.

As progress was slow with HSV, I had to put this on hold until I can get hold of a better webcam or think of a solution. I spent the rest of the day working on motion tracking, mainly with the use of the template included in OpenCV.



0116

Motion tracking is far more successful at detecting the laser dot. By using a relaxed version of the problem, I am able to "draw" on a black background by making the motion persistent. The setup involves a smooth, non-glossy black wall approximately 2m away from the Quickcam.

Motion tracking is also able to give a rough figure of how intense the dot is. The red laser, being less intense than its green counterpart, gives a much smaller dot. Also, gaps appear when trying to draw a line quickly with the red laser; this did not occur with the green laser. Noise can be observed around the trail left by the green laser, but the noise is not present with the red one.



0117

Added removeNoise, which works by downsampling the image followed by an upsampling, using Gaussian pyramid decomposition. The noise caused by the green laser is gone for good!
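
The whole of removeNoise is essentially one pyramid round trip; a sketch of the idea (OpenCV 1.x C API, assuming even image dimensions):

// Downsampling throws away single-pixel speckle; upsampling restores the size.
#include <cv.h>

void removeNoiseSketch(const IplImage* src, IplImage* dst /* same size as src */)
{
    IplImage* half = cvCreateImage(cvSize(src->width / 2, src->height / 2),
                                   src->depth, src->nChannels);
    cvPyrDown(src, half, CV_GAUSSIAN_5x5);
    cvPyrUp(half, dst, CV_GAUSSIAN_5x5);
    cvReleaseImage(&half);
}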

I proceeded to get the setup to work with a projector screen after having success in the relaxed environment. I noticed that "whiteouts" always occur at the beginning (and when I clear the image buffer by pressing 'c'), because the image buffer is initially empty and the first incoming frame is not, so the resulting difference between the two is picked up as motion.

Note: clock() behaves differently on light and my laptop.



0118

Had our weekly update today. Next week, mornings will be spent on design patterns as the project shifts to OOP design. I will have to get the draft UML diagrams ready. We plan to buy parts to make the IR pen, and a 4.5m long firewire cable. Parts for the IR pen include: a 40-70 Ohm resistor for USB power, a USB power adapter, a momentary switch, and a 1.6V infrared LED. I might want to check out light-blocking filters too. (I am looking for a 532nm passband filter for the green YAG-medium laser, or a 650nm longpass filter for the common red Krypton-medium laser.
Ref: http://www.seattlerobotics.org/encoder/200110/vision.htm)


The team tried various setups of the iSight. Having the camera behind the screen is not feasible as there is more interference there. [Edit: see 0211, camera at the back is better] So we ended up placing the iSight in front of the projector screen to test how well it is able to see the laser dot. We cheated a little by overlaying terminal transparency on the desktop to improve detection of the dot with the iSight.

Wednesday, January 9, 2008

Week 01 - SAB Week

0107

My Unneeded Presence

The 2nd SAB Meeting will be held in the Cove over the next two days (8th, 9th Jan), so everyone was taking turns using the Cove for preparations. The increased signs of activity along the corridor and in the Cove are an obvious indication that the SAB meeting must be a fairly important event. Amidst the temporary chaos, it was decided that it would be best for JL and me to be out of action for the next two days. Peng is needed for the demo though.

Also, I have discovered the cause of the unresponsiveness in the Light Draw programs on my laptop. The culprit:
cvWaitKey(n)
After a few trials, I found the uncanny behaviour that the frame will not be rendered if the parameter n passed to cvWaitKey() is <= 1. Weird, huh. I would attribute the root of the problem to my laptop (IBM T43 Thinkpad) not having a timer with resolution finer than 2 milliseconds, but I do not know this for a fact. The uber machine Light (where we run our code) did not have this problem. Then again, I'm comparing my miserable Thinkpad with a $10,000 monster. (This also reinforces my dev hypothesis that if I'm able to get things running smoothly on my laptop, then logically Light should not have any problems running the same program either.)

So apparently cvWaitKey(1) was (and still is) used in the code because cvWaitKey(30) was causing too much lag between rendering loops. This reminds me again of the problem that too much processing is done in one loop: rendering, fading, detection, trailing, memory operations etc. all happen in a single loop. Peng and JL are already looking into multithreading, but my previous experience with multithreading was generally negative. I was not able to control which thread wakes up next (in Java), and I have no control over what the scheduler does anyway. That could be due to my ignorance of multithreading, and I knew it would come back to bite me someday.


0108

My Fix: Artefacts appearing in lower/bottom right half of the screen. Config: ATI, SuSe 10.3

Setting
Option "XAANoOffscreenPixmaps" "true"
under
Section "Device"
in /etc/X11/xorg.conf should solve the problem. !!! Remember to backup xorg.conf first !!!

I've had my laptop running for a few hours now, and the problem seems to be resolved. The occurrences are random, so I am not able to tell for sure.

Ref: http://wiki.sabayonlinux.org/index.php?title=Black_borders_around_windows_fix_(ATI)

0111

On Friday the team had a morning meeting where we discussed Light Draw's progress and planned the next phase of development. Light Draw involves two aspects in general: HCI and recognition. The latter should be our main focus for now, and we will have to explore methods to detect irregular shapes.

Got Peng to show me the student's quarters on the second level.

I tried to find out why Light Draw isn't as sensitive to the red channel. I tried messing with the individual colour channels, and with Peng's help I realised that it isn't enough to work in the RGB colour representation only. CT also mentioned that the problem is non-trivial. I will probably need to get their help on this again.

Also, I had a go with the Wiimote Whiteboard program mentioned in posts 0103 and 0104. The setup is fuss-free, except for the pairing between the Wiimote and the BlueSoleil program. Instead of holding Wiimote buttons 1 & 2 down while pairing, press the red button found in the battery compartment of the Wiimote. I tested the setup on my laptop and used a remote control instead of an infrared pen, which gave me problems during calibration.

My first treat from Kevin! Kevin bought dinner today. Woohoo!


Misc stuff:

Draft N for my laptop! Got DWA140 USB dongle to work in Linux. The installation disc that came with the dongle supports only M$. Pffft.

Google tells me that the DWA140 uses the RT2870 chipset. Glad I don't have to open it up and void the warranty.

Download the RT2870 chipset driver at http://www.ralinktech.com/ralink/Home/Support/Linux.html

Some tweaks to the default installation required:
  • Edit the Makefile, replacing /tftpboot with another path, e.g. /opt/dlink/
    (the default installation places tftpboot in the root directory, which I did not like)
  • Follow the instructions found in README_STA; it would be a good idea to copy it from the extracted folder to /opt/dlink/
  • Modify os/linux/config.mk, setting
    HAS_WPA_SUPPLICANT = y
    and
    HAS_NATIVE_WPA_SUPPLICANT_SUPPORT = y
  • Copy the folder os/linux to /opt/dlink/

Sunday, January 6, 2008

Week 00 - Because that's how programmers count.

0102

My First Day at Work

A Wednesday. I reported to work punctually despite the new year mood. (Hey, I'm sober.) I was introduced to Janice from HR, who assisted me in filling up a couple of lengthy documents (I've noticed that IHPC has more forms than usual and they are particularly long). There is one document which lists the schedule of the shuttle bus services, and I bet it will come in handy. I was also given a student pass which has yet to be activated.

My Worker's Quarters

"The Cove" is what we call Lightdraw's HQ. And The Cove was used for meeting in the morning, so I chilled in the Library (which looks more like an aquarium), deleting Laptop's bookmarks, history, stored passwords temp files, system restore files, cookies etc. as I was supposed to give it to IT department for a scan.


My First Assignment

I received my first assignment from Kevin, Lightdraw's research officer: the Lightdraw team is to dismantle the Christmas tree. We rewarded ourselves by eating the Christmas tree decorations (a choco gingerbread man and a white pastry-slash-biscuit Christmas-tree-shaped thingy). They looked somewhat edible. I have not been at IHPC long enough to know whether they reuse their Christmas tree decorations or not, so Peng, my colleague on attachment from Temasek Polytechnic, bravely took the first bite. Monkey see, monkey do. The ornaments didn't taste too bad, actually.

Then minutes later, Kevin opened his pack of choco-chip cookies, given by Lester, to share. D'oh! Kevin also bought us T-shirts from Mauritius. He flew back to Singapore only yesterday.


My Leaking Air-con


Peng and JL (who, surprisingly, has the same surname as me; he's also an intern from Temasek Poly) taught me how to fix the faulty air-con which, Peng says, "fails only after holidays (and holidays only), don't know why."

Here are the steps taught to me by Peng and JL:
  1. Take the makeshift funnel, made out of a cut mineral water bottle, out of the first drawer of the cabinet the air-con is sitting on.
  2. Get a pail ready, stunned (army slang) from the pantry.
  3. Open a trapdoor located at the side of the air-con, and drain the water into the pail.
  4. Close the trapdoor.
  5. Pour the water into the beans compartment of the coffee machine which Kevin drinks from every day.
  6. Pour the water away and return the pail to its rightful owner.

On to serious matters. Peng and JL gave a demo and showcased their progress. Kevin, having just travelled back from Mauritius yesterday, suffered slightly from jetlag, but that did not stop him from giving us a briefing to put the team on track. We (including Cheng Teng, aka CT) were given a brief update on Lightdraw's progress. Here are my current tasks, in general:
  • Laser pointer (depth) detection. (Read paper provided by Kevin.)
  • Stretching, possibly detect two pointers and stretch a generic primitive.
  • Mainly HCI, look for ways to generate events to OS.

The SAB's coming over, so Kevin did a few trial runs to prep himself for the demo.

Harold joined us for lunch. I spent the rest of the day setting up SuSe, Emacs, OpenCV, SVN and Lightdraw.


Note: The access point in the Cove is Light. The external IP of the SVN repository is noted in the bible.



0103

My Excursion

Today the Lightdraw team went to the Visualization Centre (VC) at Temasek Polytechnic. Kevin hooked us up with Mr Tan, who was kind enough to explain the technology behind each of the setups in the VC. The first setup is the most related to what we are designing. It projects an image onto a mirror which reflects the image onto a piece of vertical glass that has a "special refractive holographic film which scatters light" (ref: TouchLight: An Imaging Touch Screen and Display for Gesture-Based Interaction). The image shows up clearer on the viewer's side of the glass. There are two infrared cameras to pick up infrared reflections from the user's hand, and this enables multitouch interaction, like what you see in Minority Report. I've seen something similar before on YouTube; be sure to check out the series of videos from Johnny Chung Lee (HCII, Carnegie Mellon University) at http://www.cs.cmu.edu/~johnny/projects/wii/
The first video is analogous to the technology of the VC's first setup. We have a Wii back in the Cove, so Kevin suggested that we try out his setup, as it looks pretty darn fun and interesting.

There is also a flat plasma display that supports 3D without headgear of any kind (autostereoscopic). The plasma is slightly fatter and the viewing angle for 3D is very limited, possibly 20 degrees by my estimate. Also, the viewer has to stand approx 1.5 metres away for the 3D effect to be noticeable, otherwise the image will look blurry, and that might irritate the viewer.

The next setup uses stereoscopy enabled by polarising glasses and polarising filters on the projectors for three rear projections (two projectors for each of the three panels; the two projectors use different polarisers). On screen is a simulation of a virtual MRT station. It is similar to the setup seen in IHPC, except that it has a nifty handheld controller for X-Z plane panning, and a headgear mounted with ultrasonic transmitters to communicate with 10 (?) strips of ultrasonic receivers mounted on the ceiling (my secondary school ultrasonic project paid off). The disadvantages are obvious: the headgear is heavy and messes up my do, fast movements cannot be detected, there is significant lag, prolonged use may cause dizziness, it requires a low-hanging ceiling for the ultrasonic receivers to pick up the signal accurately, the headgear comes with a heavy battery pack, and the setup is huge and requires lots of space.


The last setup is for teleconferencing. It has a normal glass pane tilted at an angle to reflect images off an LCD to a viewer; as such, the image can only be seen by the viewer and not from behind the glass. This is important as there is a motorised camera behind the glass to capture the user through the see-through glass. Our first teleconference? With a cleaner at an office in Aeon. This setup is more suited for a single user only, as there is only one keyboard and mouse. I would think the quality of the conference depends directly on how well the camera operates and on network latency. The camera looks very expensive as it has motors, a wide-angle lens and the ability to zoom at the same time.

I heard the total setup costs 4.2M. You have to be very credible to get that kind of funding.

Once again, thank you Mr Tan! I learnt a lot during this trip, and I got to experience advanced technology and equipment you don't see every day.

Wise man says: Five makes bad number for long journeys.

We took bus service 10 all the way back to the Capricorn. The miserable journey took an hour plus. Slept most of it away. Makes one wonder how Peng takes this bus to work every day. I kowtow to you, man!


My OpenCV Installation on SuSe 10.3

Back at the Cove, I resumed setting up OpenCV. Finally got it up at the end of the day with CT's help.

Here's what needs to be done. For Suse 10.3.

Google "ffmpeg opencv". First search result:
http://www.comp.leeds.ac.uk/vision/opencv/install-lin-ffmpeg.html

Follow the instructions there with a few tweaks. Instead of installing to /home/ffmpeg/ as the guide says, install it to /usr/local/. Change the paths accordingly in the ./configure commands.

RTFM! Read all the INSTALL and README files included in the packages, and the ./configure output as well. Also make sure you have the devel packages, the image libs and GTK+ 2.x, which does not come with SuSe by default. GTK+ 2.x is called "gtk2" in YaST, so search for "gtk2" instead.


This is my config from the OpenCV ./configure which you will execute later on.
Now may be a good time to download the devel packages for everything marked "yes".

Windowing system --------------
Use Carbon / Mac OS X: no
Use gtk+ 2.x: yes
Use gthread: yes

Image I/O ---------------------
Use libjpeg: yes
Use zlib: yes
Use libpng: yes
Use libtiff: yes
Use libjasper: yes
Use libIlmImf: no

Video I/O ---------------------
Use QuickTime / Mac OS X: no
Use xine: no
Use ffmpeg: no
Use dc1394 & raw1394: yes
Use v4l: yes
Use v4l2: yes



Also, only the "make install" commands require root privileges. You do not need root access for ./configure, despite what the guide says.

CT pointed out an essential fix to ffmpeg's avcodec.h (thanks CT!): patch avcodec.h in /usr/local/include/libavcodec and anywhere else it may be found or used.
Replace:
#define AV_NOPTS_VALUE INT64_C(0x8000000000000000)
with
#define AV_NOPTS_VALUE 0x8000000000000000LL

The line is found near the top of the file.


After which, OpenCV may still refuse to compile with ffmpeg. It didn't compile for me. Solution? Dr Voicu from CS3212 would say, "accept it!" That is one of the more common solutions to problems in CS. So, take ffmpeg out of the OpenCV ./configure; mine looked something like this:

./configure --enable-apps --enable-shared --without-ffmpeg --with-gnu-ld --with-x --without-quicktime CXXFLAGS=-fno-strict-aliasing
That should be all there is to compiling OpenCV. Next, getting my ancient Logitech Quickcam IM to work. Perhaps I should invest in a Logitech Quickcam Sphere? Sponsors? :)



0104

My Webcam

I managed to set up my webcam the night before. Found good sources for Logitech Linux drivers:
http://qce-ga.sourceforge.net/
http://www.quickcamteam.net/hcl/linux/logitech-webcams
(go here for Quickcam IM)

Note: Quickcam IM uses spca drivers, not uvcvideo.

For the Quickcam drivers, you will also need xawtv, v4l, SDL etc.
I got SDL off the SuSe repository through YaST; I wasn't able to compile either the downloaded rpm or the tar file.

spcaview works like magic.

Brought my camera to the office and managed to test it with the samples included in OpenCV, with Kevin's help. Change the permissions on opencv/samples/build_all.sh to executable and run it to compile the samples.

"motempl" is a good sample to use to test your webcam. "edge" too.

Next, I managed to compile the Lightdraw code.

For some reason I've yet to uncover, only "alpha", "edgeBlending" and "trail_v2" are responsive enough. Refer to the bible for the complete list. I think I'll debug this over the weekend. Also, the code is only half written in C++, so I will rewrite a version in C++ anyway.

Todo:
  • Find out why some programs are unresponsive on my laptop; probably use my desktop as a control.
  • Find out why I cannot connect to the external IP of the Lightdraw repository.
  • Read the paper.
  • Read the OpenCV wiki and manuals.
    http://www.cs.iit.edu/~agam/cs512/lect-notes/opencv-intro/opencv-intro.html looks good.
  • Get laser point detection up.
  • Try to borrow a SATA/external DVD reader from colleagues.
  • Try out Johnny Lee's Wii code.
  • Contact my supervisor from SoC, Dr TEO Yong Meng.