Review: Practical Computer Vision
The idea of computer vision has always fascinated me. The ability to get from a plain image to an understanding of its contents seems magical. Though I understand a bit of the underlying math, to build my own computer vision system would take years of study. Fortunately, this book and an open source library come to the rescue.
The first half of the book starts with an explanation of basic computer vision concepts, then jumps right into building a simple time lapse photography app with a few lines of Python code. Next come image manipulations such as cropping, color reduction, simple object detection, and histograms. This is enough to create a blue screen effect and a parking detector.
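To give a feel for how little is involved in a blue screen effect, here is a minimal sketch in plain Python. It is not code from the book and it does not use SimpleCV; the function name `chroma_key`, the list-of-tuples image format, and the default threshold are all assumptions made for illustration.

```python
def chroma_key(foreground, background, key=(0, 0, 255), threshold=100):
    """Swap every foreground pixel close to the key colour for the background pixel.

    Both images are lists of rows of (r, g, b) tuples with the same dimensions.
    """
    def distance_sq(a, b):
        # Squared Euclidean distance between two RGB colours.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    limit = threshold ** 2
    return [
        [bg if distance_sq(fg, key) <= limit else fg
         for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]
```

A real library would do this on whole image arrays at once, but the per-pixel rule is the same: if the pixel is close enough to the key color, show what is behind it.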
The second half contains the real meat of the book: detecting features. SimpleCV can pick out different shapes, filter by colors, look for faces, and even scan barcodes. One of the examples looks at a table of change to calculate the monetary value using coin size. The final chapter covers some advanced techniques like optical flow and key point matching.
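The coin example is a nice illustration of how far simple measurements can go. The sketch below is my own rough take on the idea, not the book's code and not the SimpleCV API. It assumes the detection step has already produced coin diameters in millimetres (for instance by scaling pixel sizes against a reference object), and the names `US_COINS`, `classify_coin`, and `total_cents` are made up for this example. The diameters are the nominal sizes of US coins.

```python
# Nominal diameter in millimetres and value in cents for common US coins.
US_COINS = {
    "dime": (17.91, 10),
    "penny": (19.05, 1),
    "nickel": (21.21, 5),
    "quarter": (24.26, 25),
}


def classify_coin(diameter_mm, tolerance_mm=0.5):
    """Return the name of the closest coin, or None if nothing is within tolerance."""
    name, (nominal, _value) = min(
        US_COINS.items(), key=lambda item: abs(item[1][0] - diameter_mm)
    )
    return name if abs(nominal - diameter_mm) <= tolerance_mm else None


def total_cents(diameters_mm, tolerance_mm=0.5):
    """Sum the value of every measured blob that matches a known coin."""
    total = 0
    for diameter in diameters_mm:
        name = classify_coin(diameter, tolerance_mm)
        if name is not None:
            total += US_COINS[name][1]
    return total
```

The tolerance matters: a dime and a penny differ by only about a millimetre, so a noisy measurement needs to be rejected rather than guessed.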
While I like the book overall, I do have a few nits. First, I really wish it was printed in color. Several chapters have images which can't be easily distinguished when printed in black and white. Second, I wish it was longer. While the book does cover almost every feature of SimpleCV, I'd love to read some larger example apps that combine multiple techniques. All that said, the book was still a good read and informative. It will stay on my shelf for future imaging projects.
Tue Feb 05 2013
OSCON 2013 Ideas
Below are my three main session proposals for OSCON, plus a few random ideas near the bottom that aren't fleshed out. Please give me some feedback on what you like and don't like. My goal is to have four really solid submissions. Thanks!
HTML 5 Canvas
Games account for about half of the apps in the typical app store. They are among the first things ported to any new platform. Games help drive technology forward. This year's edition of the popular HTML Canvas Deep Dive will focus specifically on building cross platform games for mobile and desktop. We will cover everything needed to build basic games with animation, scrolling, sound effects and music, image loading, sprites, and even joystick support. Then we will learn how to package them to run on desktop and mobile devices, both in and outside of app stores.
- Why make games?
- Why make games in Canvas?
- it's easy and fun!
- works everywhere. more places than any other graphics API.
- keeps getting faster and more powerful.
- Anatomy of a game
- game engine: it's more than just a run loop
- images: sprites, models, and more
- input: keyboard, touch, and joysticks
- animation and player control
- scaffolding: menus and splash screens
- picking a game engine
- 2d engines
- 3d engines
- rolling your own
- drawing to the screen with movement
- handling input:
- regular events: keyboard and mouse
- multi-touch: utilities to help
- gamepad / joystick
- background music
- sound effects
- doubling up the playback
- considerations for mobile
- resource management
- finishing touches:
- splash screens
- loading screens
- sharing your game with the world
- on the web
- mobile app stores
- desktop app stores
- tools to help
- case study
- using 3d for 2d work
- tools to help
- performance tuning
- resource editing
- existing artwork and music you can use
Make both a full three hour workshop and a 1 hour talk with the lessons?
Designing The Internet of Things with the 3 Laws of Robotics
Thanks to cheap sensors and even cheaper computing, we are rapidly approaching the age of the smart home: living spaces filled with smart things. Objects connected to each other and to the internet. Thermostats, door switches, lights, windows, gas sensors and toilets. However, this vision of things to come brings great challenges as well. How do we design interfaces for these devices? How can someone manage a house full of 200 gadgets each demanding new batteries and an IP address? What if your networked toaster rats you out to the FBI? The challenges of building a safe and understandable Internet of Things are immense. There is one existing ethical framework that can help: Isaac Asimov's Three Laws of Robotics.
In this session we will explore the complex interactions of the Internet of Things and see how the classic Three Laws of Robotics can be applied in these situations. We will cover physical safety, data privacy, setup and maintenance, and general usability. No knowledge of programming or interaction design is required, just an open mind and a desire to see the future.
- The Internet of things?
- Why is it cool?
- What counts and what doesn't?
- Inside and outside your home.
- A quick survey of the problems IoT creates:
- data privacy
- physical security
- physical safety
- data overload
- management overload
- The Three Laws of Robotics
- fictional and non-fictional history
- guidance to solve our IoT problems
- Do Not Harm a Human
- physical safety
- emotional safety
- preserving privacy
- Obey Orders From Humans
- The principle of Least User Astonishment.
- manual overrides
- decision delegation
- heuristic design
- Protect Own Existence
- Easing the management burden
- Escalation of emergencies
- Safe failure
- Next steps
A survey of visual programming languages
Pure visual programming languages sound like a great idea. Who wouldn't want to create robust and powerful programs using more than just lines of text? It is one of the holy grails of computer science, yet success has proven elusive. The last fifty years of research are littered with the corpses of failed attempts (along with a few interesting successes in unexpected areas). Why is it so hard to create a visual programming language that works in the real world? In this session we will explore the history of visual programming, looking at both the failures and successes from the fifties through to the modern day. We will look for clues about what works and what doesn't. We will extract concepts that can help us design visual languages in the future, as well as features to bring back into traditional programming environments.
- What is visual programming?
- Why visual programming?
- a picture is worth a thousand words, so it's more expressive. right?
- similarity to visual structures we already use (UML, state diagrams, GUI builders)
- non-programmers can program.
- for teaching programming.
- what counts and what doesn't.
- visual studio, no; visual basic yes;
- visual aids to traditional programming don't count.
- some part of the application must be specified in a purely visual manner. VB forms count. same with access forms.
- early attempts in the 50s and 60s
- the mother of all demos.
- 80s era research.
- smalltalk visual environments. didn't quite make it. why? what held them back?
- soviet visual programming recently uncovered.
- visual basic
- access and other visual databases
- flash, director, multi-media languages.
- music composition
- quartz composer
- educational languages:
- squeak & etoys
- lego mindstorms
- blockly: abandoned?
- android builder thingy: abandoned?
- some tasks work visually. others do not.
- UI layout
- drawing, animation, movies. any media creation.
- anything where direct manipulation helps
- where a boxes and lines metaphor already exists: music sequencers. (though it doesn't result in very extensible code)
- traditional stream and graph based algorithms
- anything dealing with strings or non-visual data structures
- building libraries. reusability seems to be especially hard to get right.
- crossover ideas:
- colors and images in a traditional editor
- rapid/instant feedback. Processing.
- show/hide overlays for interesting information.
- use of color, typography, visual layout to display purely text based code.
- mixing visual with non-visual works very well. assemble components visually. build components in regular code.
- separate editing from viewing: greek symbols and other very compact representations of algorithms?
- hide the filesystem. you don't care how your code is stored on disk. dir structure is irrelevant. Smalltalk had this.
A Few Other Ideas
My 'game editor inside the game as the game' idea.
Hacking Things Up with WebKit-nix:
Nix is a port of WebKit2 based on Posix and OpenGL/ES. It is unique due to its portability and few dependencies. While it can be run on a traditional desktop environment and GUI toolkit, its most interesting use is for embedded systems where a full GUI may not be available, and for headless applications where there is no live graphics environment. This session will cover what Nix is, how to compile it,
- list of things people have done with it
- how to compile it
- how to integrate it into a server side app
- how to integrate it into a client side app with direct GL rendering
- running it on a raspi w/o X running
- next steps and places to help
Working with the Raspberry Pi as a kiosk: no X, boot right into your app
Intro to Bluetooth Low Energy: iOS, desktop, raspberry pi, arduino.
Mon Jan 28 2013
Questions We Must Ask
Progress comes not from inventing new answers, but from discovering new questions. -- some guy
I am bored of technology. As you might guess, this is kind of a problem for someone who is a professional technologist. Sadly, I can't help it. I spent five years working on advanced GUI toolkits, then three working on cutting edge smartphones. As I watched the parade of CES announcements this month I found myself being simply, well… bored. Bigger TVs. Faster smart phones. YouTwitFace social networking integrated into everything. Nothing genuinely new. Nothing to really get me excited. The last thing that really made me say 'wow: this is the future' was the first round of Xbox Kinect demos, which sadly have yet to live up to their potential.
What is wrong? We live in an age of computing abundance. My Roku connected TV can stream shows and music from the last fifty years, plus play Galaga. I have five smart phones, each more powerful than a top of the line desktop from the mid 2000s. I can video chat with family two thousand miles away. Clearly we live in the future. So why am I so bored with it all?
I think I am unimpressed because these are technologies I have long expected to be here. Since I was a kid I assumed we would have faster computers, video phones, and ever smaller gadgets. Today's smart phone is merely the latest version of the PalmPilots and Newtons I played with nearly twenty years ago. That they have finally arrived in fairly usable form is not a triumph, but merely expected.
There are only two things that seem interesting to me right now. First is the Raspberry Pi. The Pi is very underpowered by modern standards. A 700MHz CPU with 512MB of RAM seems paltry, but combined with an insanely powerful GPU you get an amazing computer for $35. Never before has this been possible. A change in quantity, if large enough, can become a change in quality. The Pi feels like that to me. But...
Software on the Raspberry Pi still feels slow. Compared to what I had a few years ago it should be massively fast. Is our software simply too crufty and bloated to run efficiently? The Pi should be the new baseline for software. Your app should run smoothly on this computer, and it will run even better everywhere else.
There is one other thing that interests me: my 19 month old son. As I see him explore the world and discover language I once again feel the wonder from my own childhood. The pure joy of learning new things is infectious. Perhaps that is why I find myself again looking for the 'new' in the technology realm.
So, I am searching; and researching. I've spent the last few months looking at computer science papers from the 70s to the present. It's depressing to see how every new programming technology has existed for at least 30 years. I've also been scouring blogs, Reddit, used book stores, and anything else I can find in my quest to answer the question: What is next? What seems futuristic now but will seem obvious in a decade? What will replace social networking and gamification as the next wave to drive the industry forward? What new programming concept will finally help us make better software?
If you are hoping for me to give you answers, I'm afraid I will disappoint you. My crystal ball reveals no silver bullets or shiny trinkets from the future. I cannot tell you about life in a decade. I can only offer a few thoughts on what we should be building now, that we might live in a future so packed full of technology it will bore us to tears as much as the present. These are the questions we should be asking.
Can multi-processor computers change our lives?
I recently reread some of the original papers around Smalltalk and the Dynabook. The belief at the time was that personal access to high speed computing technology would change how we live. The following thirty years have shown this belief to be true; but are we nearing the end of this transformation?
It is now generally accepted that the future of Moore's law is to have parallel CPUs rather than faster ones. This is great for server side developers. The everyday programmer can now finally use the last thirty years of research in parallel computation. However, the desktop computing experience hasn't really changed. My laptop has four cores, and yet I still perform the same tasks that I did a decade ago.
The real question: Are there tasks which local parallel computation makes possible that would change our lives? What new thing can I do with my home computer that simply wasn't possible ten years ago? Hollywood of the 90s tells us we should be doing advanced image analysis and global visualizations with our speedy multi-core processors through holographic screens. Reality has turned out less exciting: Farmville. Our computers may be ten times faster, but that doesn't seem to have actually made them better.
How can we replace C?
I can't believe we will use C forever. Surely the operating system on the Starship Enterprise wasn't written in C, and yet I see no way to replace it. This makes me a sad panda.
I hate C. Actually, I don't hate C: the language. It's limited but good at what it does. Rather, I hate C compilers. I hate the macro processor, I hate header files. I hate the entire way C code is produced and managed. Try porting an ARM wireless driver across distros and you will agree. C code doesn't scale cleanly. And yet we have no alternatives? Why?
I think the key problem is the C ABI. I could write a system kernel or library in a higher level language, but to interoperate with the rest of the system I must produce a binary blob compatible with the C ABI. This means advanced constructs like objects can't be exposed. Library linking is further complicated by garbage collection. If one side of a function call is using GC and the other is not, then who is in charge of cleaning up allocated memory? With C it is simple. A linked library is no different than if you had included the code directly in your app. With a GC'd language that library now comes with its own runtime and background processes that must be managed.
Header files don't help either. If I wish to call C code from a non-C language I must parse the entire header file, or hack it in through some language specific FFI. Since .h files are essentially Turing complete, I must process them exactly as a C compiler would, and then predict how the compiler generated the original binary. Why doesn't the completed binary contain this information, instead of me having to reverse engineer it from a macro language?
All languages provide a way to link to the C ABI. So if you want to build a library that can be reused by other languages, you have to write it in C. So all libraries are built this way. Which means all new systems link only to the C ABI. And any new languages which want to be linked from other systems compile down to C. You could never build an OS in Go or Ruby because nothing else could link to the modern structures they generate. As long as the C ABI is the center of the computing universe we will be trapped in this cycle.
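Here is what linking to the C ABI looks like from a higher level language, as a small sketch using Python's standard `ctypes` module to call `strlen` from the C library. The wrapper name `c_strlen` is made up for this example, and the library lookup assumes a POSIX system such as Linux or macOS. Notice that the function's signature has to be restated by hand, because the shared library itself carries no type information.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into this process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# The binary does not describe its own functions, so the prototype from
# <string.h>, size_t strlen(const char *s), is repeated here manually.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Call the C library's strlen on a Python bytes object."""
    return libc.strlen(text)
```

If the declared types are wrong, nothing checks them. The call simply misbehaves at runtime, which is exactly the reverse engineering burden described above.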
There must be a way out. Surely these are not insoluble problems, but we have yet to solve them. Instead we pile more layers of abstraction on top. I'm afraid I don't know the answer. I just know it is something we must eventually solve.
How can we reason about software as a whole?
I'll get into this more in a future blog, but the summary is this. Too much effort is spent trying to improve programming languages themselves rather than the ecosystem around them. I've never felt like lack of concurrency primitives or poor type systems were the things preventing me from building amazing software. The real challenges are more mundane problems like trying to upgrade a ten year old database when an unknown number of systems depend on it. The problems we face in the real world seem hopelessly out of sync with the research community. What I want are tools which let us reason about software in the large. All software. All of it.
Imagine a magic database which contained all of the source to the codebase you are working on, in every revision, and with every commit log. Furthermore this database understands every programming language, data format, and config file you use. Finally it also contains the code and history of every open source project ever created. (I said it's magic, remember). What useful questions could you ask such a database? How about:
- Is library X integrated, or is it really a collection of classes in several groupings that could be sliced apart? Which classes should we target? The Apache Java libraries could really benefit from this.
- Is there another open source library which could replace this one, and meets the platform, language, and memory dependencies of the rest of my system?
- How many projects really use method X of library Y? Would changing it be a big deal?
- What coding patterns are most repeated in a full Linux distro? How many packages would have to change to centralize this code, and how much memory would it save?
We need ways to reason about our software from the metal to the cloud, not just better type systems. It would be like having a profiler for the entire world.
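A tiny slice of one of these questions, how many projects really use method X, can already be approximated for a single language. The sketch below uses Python's standard `ast` module to count calls to a method name across a set of sources. It is nowhere near the magic database: it handles only Python code, it matches on the method name alone without knowing which library the object came from, and the function names and the `projects` dictionary layout are assumptions made for illustration.

```python
import ast


def count_method_calls(source: str, method_name: str) -> int:
    """Count call expressions of the form <anything>.method_name(...)."""
    tree = ast.parse(source)
    return sum(
        1
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == method_name
    )


def usage_by_project(projects: dict, method_name: str) -> dict:
    """Map project name to call count, keeping only projects that use the method.

    `projects` maps a project name to a list of source code strings.
    """
    counts = {
        name: sum(count_method_calls(src, method_name) for src in sources)
        for name, sources in projects.items()
    }
    return {name: n for name, n in counts.items() if n > 0}
```

Answering the question properly would need type resolution, every language, and every revision, which is the gap between this toy and the tool I actually want.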
How can we make 10x denser batteries?
While not software related directly, batteries impact everything. I'm not talking about our usual 5% a year improvements. I mean 10x better. This requires fundamentally new technology. It may seem mundane but massively denser batteries change everything. It becomes possible to make power in one part of the country (say, in a protected nuclear plant in the desert) and literally ship the power to its destination in trucks.
Want a flying car? 10x batteries make it possible. Modern sensors and CPUs make self flying cars possible; we just need 10x power density to make a flight longer than a few minutes. Everything is affected by power density: cars, smart homes, super fast rail, electric supersonic airplanes. Want to save the environment? 10x better batteries do it. Give the world clean water? 10x better batteries.
How can we put an MRI in your shower?
This may sound like a bit of an odd request, but it's another technology that would change the way we live. Many cancers can't be detected until they are big enough to have already caused serious harm. A tiny spot of cancer is hard to find in a full body scan, even with computer assisted image recognition. But imagine you could have a scan of your body taken every day. A computer could easily recognize if a tiny spot has grown bigger over the course of a week, and pinpoint the exact location it started. The solution to cancer, and so many other diseases, is early detection through constant monitoring.
Whenever you see your doctor with an ailment he goes through a list of symptoms and orders a few tests. The end result is a diagnosis with a certain probability of being true; often a probability far lower than you might expect. Constant full body monitoring changes this equation. Feeling a pain in your side? Looking through day by day stats from the last year can easily tell if it's a sign of kidney stones or just bad pizza you ate last night.
Constant monitoring only works if it is cheap, so cheap that you can afford to do it every day, and automatic so that you actually do it every day. One solution? An MRI equivalent built into your shower. When you take a shower a voice asks you to stand still for 10 seconds while it performs a quick scan. That's it. One scan, every day, could eliminate endless diseases.
As I said at the start, these are just ideas. They aren't prognostications of a future ten years from now. They are simply things we should be working on instead of building the next social network for sharing clips and watching ads. If you want to change the world, ask some bigger questions.
Thu Jan 24 2013
OSCON 2013: What do you want to see?
The call for proposals for OSCON 2013 just went out. OSCON is the one conference I try to speak at every year because the topics are so diverse and interesting. And being just up the road in Portland doesn't hurt either. However, I'm having trouble deciding what to submit. Too many things interest me. So I thought I'd consult the wisdom of the crowd. What do you want to see?
I plan to submit an update to my 3 hour HTML Canvas workshop since it is still a very relevant technology. It has been my most popular session. However, I don't just want to rehash what I did the last two years. What new things would you like to see? A game engine? Graphing algorithms? UI toolkits? More on Audio and Input?
Even if you don't plan to attend OSCON, the Canvas talk may still be relevant to you. The content from these sessions has turned into my popular open source ebook HTML Canvas Deep Dive. The book accounts for 75% of the hits on my blog, making it the most popular thing I've written by far. Any updates for OSCON 2013 will go into a new edition of the book.
I'm also open to other topics that fall within my area of expertise. The internet of things? Arduino hacking? Mobile app design? Usability? I'll even consider some Java stuff if that's what you want. What are your burning topics for 2013?
Mon Jan 07 2013
Super Christmas Adventure
Much like a painter or musician, sometimes an idea
forms in my head and will not let me rest until it comes
out. Usually such an idea is an algorithm or graphics
demo, but this time it came in the form of a game;
a game which will not quiet until born.
To that end I present to you: Budu Budu Tiki Mon's Super
Christmas Adventure, an NES style RPG playable
in your browser.
I've always been a fan of NES/SNES era RPGs, the
Final Fantasy series in particular. Though fun
to play they are also easily parodied due to common
tropes throughout the games. Each takes place in
a different universe with different characters, but
they always have a helper named Cid, a flying vehicle
of some sort, ridiculous weapons, twisting plots,
and backstabbing villains. As I said, ripe for parody.
And what better genre of parody than Christmas movies?
Silly characters, a princess to save, amusing dialog, and great
chiptunes (gratefully borrowed
from 8bit peoples).
This is just a prelude of a full game. SCA contains a small
overworld, two villages, and a dungeon. If there is interest
I'd love to turn it into a full game.
I want to stress that the prelude is in no way finished. The
game engine is rife with bugs, some characters are missing
dialog, and the graphics need further tweaks. There simply
wasn't enough time to polish it before the holidays. Such is
the life of a toddler father. Please
tweet me any issues
and I'll fix'em ASAP.
Have a Very Merry Christmas!
Mon Dec 24 2012