096 RR Topaz with Alex Gaynor

by Charles Max Wood on March 13, 2013

Panel

Discussion

01:30 – Alex Gaynor Introduction

01:54 – Combining Python and Ruby

05:57 – Websites that run on Django

06:38 – Running Topaz

09:15 – TopazWroclove2013: Tim Felgentreff

13:44 – Compilation Errors

15:10 – Integration with Ruby

17:47 – Performance and Benchmarking

20:52 – Global Interpreter Lock

  • Waiting to implement features

23:26 – Design Goals

  • Simplicity
  • Performance

26:10 – Worst Part of the Ruby Language

  • Complexity of constant lookups
  • Number of different scopes

28:40 – Why Python, Ruby, and Javascript are Slow by Alex Gaynor

35:16 – Where is Topaz now/Where will it be in the future?

44:20 – Ruby features that people use wrong

46:23 – Blocks

Picks

Book Club

Patterns of Enterprise Application Architecture by Martin Fowler: You have one week left to finish it! Yay!

Next Week

Patterns of Enterprise Application Architecture with Martin Fowler

Transcript

JOSH:  And I’m also drinking my coffee from an Aperture Science mug.

JAMES:  Wow!

JOSH:  Yeah.

JAMES:  Wow! Alright, you win! You’re the geekiest!

JOSH:  [Chuckles] Well, right now.

JAMES:  [Chuckles] Yeah, right now.

[Hosting and bandwidth provided by the Blue Box Group. Check them out at BlueBox.net.] 

[This podcast is sponsored by New Relic. To track and optimize your application performance, go to RubyRogues.com/NewRelic.] 

CHUCK:  Hey everybody and welcome to Episode 96 of the Ruby Rogues podcast. This week on our panel, we have Katrina Owen.

KATRINA:  Hello from Denver.

CHUCK:  James Edward Gray.

JAMES:  It’s time to take off your gem-crested boots and put on your hat with that cool slick thing.

CHUCK:  Josh Susser.

JOSH:  Next time, make me go first because I never know how to follow anybody anymore.

[Laughter]

CHUCK:  Especially with gem-crested boots and a hat band, right?

JOSH:  Yeah, I don’t even know where that came from.

[Laughter]

JOSH:  But good morning!

CHUCK:  I’m Charles Max Wood from DevChat.tv. And this week, we have a special guest. And that’s Alex Gaynor.

ALEX:  Hi! I was not told I need clever introductions.

JAMES:  [Laughs] Well, you came up with one anyway.

JOSH:  Alex, I’ve been doing this for over a year and I still don’t have one.

CHUCK:  [Laughs] So, you want to introduce yourself really quickly, Alex?

ALEX:  Sure. So, my name is Alex. I live in San Francisco. I am primarily a Python programmer. I work for a company called Rdio.com. We do streaming Internet music. And I guess I’m here because I wrote a thing called Topaz which is a Ruby VM written in Python.

JAMES:  [Chuckles] So, I guess the obvious question to follow is, why?

JOSH:  [Laughs]

ALEX:  Well, first and foremost, because it was a lot of fun. A friend of mine was building a PHP VM. He was doing that because he likes pain, I think. And I was sort of code reviewing it as he went along and started to feel like, “Gee, writing a VM from scratch looks like a lot of fun.” All the VMs I’d worked on before had been big, existing things that I sort of came and worked on once there were already hundreds of thousands of lines of code. And building one from scratch looked like fun. So, that was sort of the first and foremost reason.

And the second reason was I wanted to demonstrate that RPython, the language that Topaz and PyPy are implemented in, is a really fantastic platform for building high performance dynamic language VMs.

JAMES:  So, are you far enough along now to know that you were wrong and it’s not fun?

ALEX:  No, no!

JAMES:  Still fun?

ALEX:  Yes, it’s still fun after 10 months.

JAMES:  Wow! That’s awesome.

CHUCK:  Sounds like having a child, after 10 months!

[Laughter]

ALEX:  I won’t pretend that I’m not having some angry yelling-at-my-computer moments, but it’s been a very fun experience.

JOSH:  [Chuckles] So, Ruby or Python?

ALEX:  I’m definitely still a Python [inaudible] at heart.

CHUCK:  Sorry.

ALEX:  Sure, I have now offended the entire audience of this podcast.

JAMES:  That’s okay. We forgive you.

JOSH:  De gustibus non est disputandum.

JAMES:  [Chuckles]

KATRINA:  One cannot discuss tastes.

JOSH:  Ahhh…thank you.

[Laughter]

JOSH:  Or the idiomatic translation is ‘There’s no accounting for taste’.

[Laughter]

KATRINA:  Sorry, I don’t know the idiomatic translation of what you’re talking about.

JOSH:  Well, doing the VM is impressive for any reason and doing it for fun is even doubly impressive. So, congratulations!

JAMES:  Okay. So Alex, you’re still a Python [inaudible], as you said. And you have Python right there which kind of is almost like a competing language with Ruby in many ways. They do a lot of similar things in different ways, right? So, what made you want to put Ruby on top of that?

ALEX:  I guess it never, at any point, occurred to me that this might be sort of undercutting my position as thinking Python is clearly the most amazing language and that everyone should use it. That never really occurred to me, which I guess shows a real lack of creativity on my part. I guess, I just don’t see this competitive thing. I think having more good VM implementations for all these languages which I prefer writing over C or Java or something. Having good VM implementations for all of them is worthwhile and doesn’t undercut the things I care about.

JAMES:  That’s cool.

JOSH:  Okay.

KATRINA:  It also doesn’t seem like Ruby and Python really are competitors. Like, they focus on different spaces. Like the Python community seems to focus a lot on the scientific sort of side of things even though with Django, there is a good sort of easy web, I guess, framework that one can work on; whereas on Ruby, a lot of the focus is on web.

JAMES:  Yeah, you got a good point. And also, Python’s good for games and event loop programming and stuff like that.

ALEX:  I mean, before I was a compiler author, I worked on Django and it’s basically my day job working with Django to build Rdio. So, I guess I’ve always thought of Python as being very into the web space. But maybe, I guess, it is less the dominant player than web is for Ruby.

KATRINA:  So, just sort of out of curiosity, what other large websites run on Django?

ALEX:  Addons.Mozilla.org, DISQUS, the commenting tool. You should never… I feel like trying to list the people using your software is a terrible idea; you will always forget some.

[Laughter]

JOSH:  Is Stack Overflow Django or is it just Python?

ALEX:  I thought it was C#, to be honest.

JOSH:  Really? Oh, well there could be.

[Laughter]

[Crosstalk]

ALEX:  Presumably it’s Python, I had no idea.

KATRINA:  I totally derailed this conversation. Let’s get back to Topaz.

[Laughter]

CHUCK:  Yeah. I was actually going to ask that. What do you need to run Topaz on your machine?

ALEX:  So, we put up nightly builds for Linux 64 and OSX 64 systems. So, if you have one of those, you can just download it and run the binary directly. If you’re on some other platform such as a 32-bit machine or Windows, you’ll need to get a checkout of the Topaz source code as well as a checkout of the PyPy repository and then build it. If you look in the documentation, there are instructions on how to do that.

CHUCK:  So, I have to ask then, you said that you can get binaries for Linux 64-bit. Is Python standard enough across 64-bit for you to just give people binary and just have it work?

ALEX:  So, this is where the whole Topaz, RPython, PyPy, Python quadrangle, I think, gets a little confusing. Topaz is written in RPython which is a subset of the Python language. So, if you look at the source code, it looks completely like normal Python code which is true, it is. You can run all of Topaz on top of Python. In addition, a part of the PyPy project, which is slowly becoming its own project, is the RPython language and a compiler. So, RPython is the subset of Python that we can run type inference on. So, it’s implicitly statically typed.

And then, we compile RPython down to C and we compile that to, obviously with GCC down to assembler. And so, the end result of this compilation process is RPython programs can be turned into a single binary. And yes, they’re, in theory, distributable.

JOSH:  That sounds a little bit like the Squeak language layering.

ALEX:  I’m sorry, Squid?

JOSH:  I’m sorry, Squeak. [Chuckles]

ALEX:  Oh, Squeak.

JOSH:  Yes, the Smalltalk Virtual Machine that was written in Smalltalk.

ALEX:  Yeah, it definitely shares a lot of ideas, obviously the language sort of being written in itself. I guess the thing I most know Smalltalk for is the image design, which this doesn’t share.

JOSH:  Yeah. So, Squeak is pretty much what you described as, it’s a Virtual Machine that’s written in a subset of Smalltalk that can be easily translated to ANSI C and then compiled into a binary of the VM.

ALEX:  Yeah, very similar design.

JAMES:  There is a great slide deck you sent to us that kind of goes through these different parts – the Ruby Interpreter, the Topaz part, and then the translated C and all that. And it talks about how you say — and maybe this is actually worth discussing a little bit on the show. And that slide deck actually is somebody else’s slide. But it says, “Topaz is not a VM that runs Python and Ruby,” which is kind of what we expect when we hear something like this, right? You have JRuby and it runs Java and Ruby, right? Tell us what that all means.

ALEX:  Yeah. First, the slide deck was written by Tim Felgentreff who worked on GemStone which is a project to put a Ruby VM on… sorry, the project is Maglev, which is an attempt to put Ruby on the GemStone Smalltalk VM. And he’s been helping out with Topaz for quite a long time by now. Basically, the idea there is when people hear Ruby in Python, depending on where they come from, they either want to be able to somehow use Ruby libraries in Python or use Python libraries in Ruby. And that’s not really the case here. Basically, the whole idea of the RPython translation compilation framework tool chain (I’m never going to refer to that consistently) is that current models of having a single VM design and having a single just-in-time compiler don’t really work well. So, that was our experience looking at the design of Parrot, of looking at the JVM. Someone on the mailing list once really succinctly wrapped this up: once you’ve decided on a bytecode, an object model, or any of these other details, you’ve basically decided what languages your VM is going to have first class support for where the ANSI languages is really superfluous.

So, the idea of RPython is you write your VM basically like you would in any other tool kit, writing a VM from scratch. So, a VM written in RPython has a lot of the same design characteristics as one written in C. You write a lexer, you write your parser, you have an AST, then your AST is compiled to bytecode, and then you write an interpreter for your bytecode. And if you look at the Topaz source, that’s basically what you see, if you look at interpreter.py or parser.py or any of these other files.
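The stages Alex lists (lexer, parser, AST, bytecode, interpreter loop) can be sketched in a few lines of Ruby. This is a toy for an arithmetic language with only + and *, not Topaz code; the method names lex, parse_sum, parse_product, compile, and run are made up for illustration.

```ruby
# Toy pipeline: source -> tokens -> AST -> bytecode -> result.
# Every name here is hypothetical and unrelated to Topaz's real source.

# Lexer: split the source text into integer and operator tokens.
def lex(source)
  source.scan(/\d+|[+*]/).map do |text|
    text =~ /\A\d+\z/ ? [:int, text.to_i] : [:op, text]
  end
end

# Parser: recursive descent, with "*" binding tighter than "+".
def parse_sum(tokens)
  node = parse_product(tokens)
  while tokens.first == [:op, "+"]
    tokens.shift
    node = [:add, node, parse_product(tokens)]
  end
  node
end

def parse_product(tokens)
  node = tokens.shift # an [:int, n] leaf
  while tokens.first == [:op, "*"]
    tokens.shift
    node = [:mul, node, tokens.shift]
  end
  node
end

# Compiler: flatten the AST into instructions for a stack machine.
def compile(node)
  if node[0] == :int
    [[:push, node[1]]]
  else
    compile(node[1]) + compile(node[2]) + [[node[0]]]
  end
end

# Interpreter: a loop that dispatches on each instruction.
def run(bytecode)
  stack = []
  bytecode.each do |op, arg|
    case op
    when :push
      stack.push(arg)
    when :add
      right = stack.pop
      left = stack.pop
      stack.push(left + right)
    when :mul
      right = stack.pop
      left = stack.pop
      stack.push(left * right)
    end
  end
  stack.pop
end

bytecode = compile(parse_sum(lex("1 + 2 * 3")))
result = run(bytecode)
```

A real VM adds objects, method dispatch, and error handling at every stage, but the overall shape stays the same.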

And so, from there basically, we just have an interpreter. We don’t have any basis for doing interop with any other language including Python, at least not out of the box.

One of the benefits of building on this framework is you reuse your common just-in-time compiler, you reuse your common garbage collector. And support for doing external modules that are written in RPython but not distributed with the VM is sort of poor. So, RPython does not have good separate compilation support. Everything that’s part of the interpreter needs to be built at once.

If we had that, you could, in theory, turn Topaz into a module you can load in Python. Or turn PyPy into a module that you could load in Topaz. But those were never the big design goals for this project.

JAMES:  So, what’s cool about what you just said, in my opinion, is basically that you’re getting to write this in Python, you’re not having to write it in C. But in the end, because it compiles down to a binary, you’re still getting really good performance, right?

ALEX:  Yeah. So, RPython is a very interesting language. It’s incredibly inconvenient in many ways. The compilation error messages you get are terrible, and it’s generally under-documented. But it is fairly fantastic for writing virtual machines because that’s sort of what it was designed for. And the very, very key part of the whole process is the automatic generation. So as a part of compiling an RPython program, if you insert a few hints into the source code about how your interpreter works, and we’re talking very small hints, a few lines of code, RPython is able to automatically generate a just-in-time compiler for you.

KATRINA:  So, you mentioned that the compilation errors are horrible, in what way?

ALEX:  They tend to be long, just lots of text for simple mistakes. They tend to not point you at the place you made a mistake. So, you’ll pass something of the wrong type in one place and then, you’ll get an error in an entirely different file in a function that you don’t even think is related. And part of that, it has to do with just the way the type [inaudible] works. It’s not like someone wrote bad error messages. There’s a fundamental design consideration that doesn’t really go well with good error messages, as far as we’ve found.

CHUCK:  So, I want to just kind of roll back a little bit. You made it sound almost like you could write a Ruby program in Topaz and then compile it. I may have misunderstood that.

ALEX:  No, sorry. This is another one of the unfortunate confusing bits. No, Topaz is just a Ruby VM.

CHUCK:  Okay.

ALEX:  It has sort of the exact same running pattern as CRuby or Rubinius. You get a binary, you point it at a Ruby program, it runs. There’s no separate compilation step.

CHUCK:  Alright.

JAMES:  He just meant that the RPython translation ends up compiling Topaz down to the C binary.

JOSH:  Alex, I have a couple of questions about sort of the integration with the rest of the Ruby world. And the first one is, are you paying attention to RubySpec and is the VM like passing a bunch of the RubySpecs?

ALEX:  Yeah, absolutely. We’ve been using RubySpec as our primary test suite for quite a while. And yeah, it’s an amazing project to work with, just having all of Ruby documented sort of method by method of what it needs to do. I think right now we pass 5500 specs or something like that, or we’ve passed 5500 expectations maybe. So, we’re passing a lot of them and still lots and lots more to go.

JOSH:  Well, that’s cool. And I’m curious about if you’re making any use of Rubinius technology. And that’s a big question so I’ll refine that a little. But Rubinius is meant to be like Ruby written in Ruby and so a lot more of it is in Ruby code than in C code. So, that’s less C to translate into RPython. So, I’m guessing that maybe you’re using some of the Rubinius stuff to reduce the amount of work you have to do.

ALEX:  We are not using it yet. It’s an open ticket to figure out how we can integrate with Rubinius’ kernel, which is what they call the pure Ruby implementations of things that are written in C in CRuby. Essentially, yes, we’d love to. We’re already writing a lot of things in Ruby that are in C in CRuby. So, we’re trying to write quite a bit of Ruby code and we’d ultimately love to share some amount of the common code with Rubinius. We’re not quite there yet. Rubinius, obviously, is a much more complete VM. So, some of their kernel relies on features that we haven’t implemented yet. But it’s definitely a long-term goal.

KATRINA:  So, how do you write — let’s say Topaz is written in RPython. So, it has to be in Python, right? For it to be able to compile down to C. Where — at which level do you write the Ruby?

ALEX:  So, the Ruby is just normal Ruby. If you look in the repository, there’s a directory called lib-topaz. And essentially, when the Topaz binary starts up, it just loads all the files in there to get implementations of all those methods.

KATRINA:  Cool. So, that doesn’t get compiled down at all?

ALEX:  No. If you check out the binaries, we distribute this lib-topaz directory right in there.

KATRINA:  Neat.

CHUCK:  I have to wonder too. A lot of these alternative VM implementations like Rubinius and that’s kind of what brought it to mind. They claim that certain areas of Ruby performed better on their VM than in MRI which — I mean, I totally get that it probably would work that way from one VM to another depending on what their focus is and what they get out of the way that they built it. Have you done any performance or benchmarking on Topaz versus like MRI or some of the other implementations?

ALEX:  So, I’ve done some very limited benchmarking and basically, in my experience, Topaz is almost across the board faster than all the other Ruby implementations.

JAMES:  So, let me ask you there, though. I know from looking at that slide deck I mentioned earlier that you don’t yet have some of the key components or at least, at that time that deck was made like eval and set_trace_func and stuff like that.

ALEX:  I think that slide was a list of things we do have. We definitely have both set_trace_func and eval.

JAMES:  Oh, cool. Yeah, you’re right. It was the next slide. Fibers, FFI, am I right now?

ALEX:  Yeah.

JAMES:  Okay. So, is that part of the reason for the speedup, that you don’t have everything that Ruby has yet, or do you know that it really is going to be fast?

ALEX:  No. Basically it really will be fast. So, Fibers are basically similar to what in Python are called greenlets, they’re green threads basically. And basically, the only reason we don’t have those is I’m waiting for a branch to land in RPython which makes support for them a lot cleaner. They should be straightforward and not have a performance impact on the rest of the VM, same with FFI. FFI is similar to, in Python land, ctypes or CFFI, which again will be relying a lot on what the RPython support for those libraries is.
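For readers who have not used them, here is a small sketch of what Fibers look like from Ruby code: cooperatively scheduled blocks that pause with Fiber.yield and continue with resume. It uses only the core Fiber class from Ruby 1.9 and later; the variable names are illustrative, and it says nothing about how Topaz will implement them.

```ruby
# Fiber is part of core Ruby; new, resume, and Fiber.yield need no require.
log = []

worker = Fiber.new do
  log << "worker: step 1"
  Fiber.yield # pause here and hand control back to the caller of resume
  log << "worker: step 2"
  :done       # the block's value is returned by the final resume
end

log << "main: start"
worker.resume         # runs the fiber up to its first Fiber.yield
log << "main: between"
final = worker.resume # continues after the yield until the block ends
```

Control only changes hands at the explicit resume and Fiber.yield calls, which is what makes these "green" rather than preemptive threads.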

JOSH:  And the Ruby language version you’re targeting is like 1.9.3.

ALEX:  1.9.3, yeah.

JOSH:  I’m curious, is this a bytecoded VM?

ALEX:  Yeah. It’s a bytecode VM. The design is superficially pretty similar to the Python VM actually.

JOSH:  Okay. So, like Python or PyPy has an instruction set, a bytecode set, and you’re basically using the same bytecodes?

ALEX:  No. So, that’s sort of one of the details of RPython. RPython does not enforce a particular bytecode on you. I happen to choose a design that was very similar to the Python bytecode. It’s not necessary though.

JOSH:  Okay. Did you consider looking at the bytecodes? I think 1.9 has a bytecode set internally and Rubinius definitely has a bytecode set.

ALEX:  I didn’t look at the 1.9 bytecode. I did look at the Rubinius bytecode.

KATRINA:  Is there a problem with the global interpreter lock? Or does RPython just not have that problem?

ALEX:  There is no global interpreter lock in Topaz. There are also no threads right now.

CHUCK:  There you go, problem solved!

[Laughter]

[Crosstalk]

ALEX:  Yeah, right now, PyPy does have a global interpreter lock because RPython’s garbage collectors are not thread safe basically. But Armin Rigo who was the creator and one of the lead developers of the PyPy project has been working on Software Transactional Memory support in RPython to basically address this. And for now, I’m just going to sit it out and wait to see how that work goes before adding threads.

CHUCK:  We’ve asked about quite a few things and you’ve mentioned that you’re waiting for RPython or PyPy to kind of come to the point where it’d make some of these things easier. And with some open source projects, you can pretty well count on if they say they’re going to do it, you’re going to get it in a reasonable amount of time and then other projects, it’s not always that way. And sometimes, these things wind up being way harder to implement than they think they are. And so, you wind up waiting a long time for that.

I’m just wondering, how long are you willing to wait for some of these before you go ahead and try to implement it on your own anyway?

ALEX:  I guess I don’t have a fixed timetable. I work on RPython so I work with the people who are working on these projects. So, I have a good sense of how long these things are going to take. The Fiber support I’m waiting on, for example, we’re hoping to have merged into master or default [inaudible]. We’re hoping to have that merged by PyCon. So, coming up in like a week and a half.

STM is a bit of a longer term goal. But I think threads as a whole… I think there’s a lot of other pieces of Ruby you’d want us to have first before you came asking for threads. I mean, it is possible to have threads just with the global interpreter lock. That’s what PyPy has now. But my view was if there are going to be any design changes we want to make as a result of having possibly a true multithreaded interpreter, we may as well wait until we know what that’s going to look like. And Armin already has prototypes of the STM work that basically show yes, we’re able to get a linear speed up if you have enough cores. And now, he’s working on bringing down the number of cores you need to get the speed up and working on the JIT integration.

KATRINA:  I think you already mentioned the interoperability between like Python and Ruby wasn’t one of the design goals. What are the design goals?

ALEX:  The primary design goal was, first of all, simplicity, being something that someone without an already huge VM background could sort of try to read. And the other design goal is performance. We really wanted to demonstrate that RPython was a platform for building fast VMs.

JAMES:  Alex, how big is the project currently?

ALEX:  I think we’re at something like 20,000 lines of RPython and a couple of thousand lines of Ruby.

CHUCK:  How many of the features or these newer features in RPython and PyPy are being driven by this project as opposed to just anything else that relies upon it?

ALEX:  So, I wouldn’t say any of the features I’ve talked about so far are being driven by this. But probably the biggest thing is if you’d asked me six months ago, basically RPython was a sub-directory in the PyPy repository that was — it wasn’t totally an implementation detail of PyPy like we knew it was its own project that could stand alone but it wasn’t treated that way. The docs were shared. They’re in the same directory. And now, we’re sort of working on really splitting those out into separate repositories that have separate documentation, separate tests, fully sort of independent projects. PyPy will use RPython the same way Topaz will use RPython. Everyone will be a first class citizen, basically. And that’s definitely been driven a lot by Topaz and some other VMs that have sprung up using RPython.

CHUCK:  So, I’d also like to know. It seems like with a lot of the other implementations like JRuby, I know that IronRuby was a thing and I think it is a thing again. Some of these other implementations are basically not just other implementations of Ruby but are actually ways of getting Ruby into those infrastructures where people rely on other things like .NET or Java. And so, since this compiles down, it doesn’t actually have ways to hook into Python. I’m a little curious if you considered making that possible, or is that just something that you can’t do with RPython and PyPy?

ALEX:  Yeah, that was never really a design goal for Topaz. I think someday, it will be possible either to use Topaz to embed Python in Ruby or the reverse. But it’s not a priority on any sort of timescale I can see.

JOSH:  Okay. I have a little bit different direction here. So now that you’ve gotten your hands dirty on the inside of Ruby, what’s the worst part of the language?

ALEX:  Single worst part of the language? I would say the complexity of constant lookups.

JOSH:  Oh, yeah. [Chuckles] Okay.

ALEX:  Or more broadly, maybe the number of different scopes.
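One concrete flavor of the constant lookup complexity: a bare constant reference searches the lexically enclosing modules before it searches the superclass chain, while an explicitly scoped reference searches only the named class and its ancestors. The module and class names below (Outer, Base, Child) are invented for this sketch.

```ruby
module Outer
  NAME = "outer"

  class Base
    NAME = "base"
  end

  class Child < Base
    # A bare NAME checks the lexical scopes first (Child, then Outer)
    # and only afterwards the ancestors of Child.
    def self.bare_lookup
      NAME
    end
  end
end

lexical   = Outer::Child.bare_lookup # found in the enclosing module Outer
inherited = Outer::Child::NAME       # scoped lookup goes through Base
```

The same name resolves to two different constants depending on how it is written, and a VM has to implement (and cache) both paths correctly.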

CHUCK:  Can you explain that just a little bit? I know some people are at different levels. I’m not 100% sure what you’re — where the issues are either.

ALEX:  Basically, my experience is that Ruby has many different types of sort of variables. There’s $global_variables that start with the $, there’s Constants that start with an upper case letter, there are local_variables that are lower case or start with a lower case letter. And there’s attributes with — sorry, @instance_variables which live in one name space and methods which live in another.

JAMES:  @@class_variables.

ALEX:  Yeah, @@class_variables as well. So, it feels like there’s many different sort of name spaces or scopes in Ruby. And coming from my… you know, Python, I think, is a more simplistic model for better or for worse in that there are basically attributes, local variables, closures, and global variables. And they all… yes, attributes look like one thing and then variables of global or closure local scopes sort of look the same. It’s really a very different aesthetic.

JOSH:  So in Ruby, you know, the instance variables that you get in a class or in an instance are very lazy. You write the name of an instance variable and suddenly it’s there in the instance. And there’s no pre-declaration of those things in the class definition like there is in Smalltalk or C++. How is it in Python? Do you have to create or include these things when you define the class, or do they just come into being as you mention them?

ALEX:  There’s no declaration in Python. Attributes are basically similar to instance variables. The one difference being in Python, accessing an attribute that doesn’t exist raises an error. In Ruby, you try to get an instance variable that doesn’t exist, you get a nil.
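The kinds of names Alex and James list can all appear in one small class. The Counter class and the $request_count global below are invented for illustration; each comment marks which name space the identifier lives in.

```ruby
$request_count = 0 # global variable: visible everywhere

class Counter
  LIMIT = 3        # constant: starts with an uppercase letter
  @@instances = 0  # class variable: shared across the class hierarchy

  def initialize
    @count = 0     # instance variable: stored on this object
    @@instances += 1
  end

  # Method named count: a separate name space from @count.
  def count
    @count
  end

  def bump
    step = 1       # local variable: visible only inside this method
    @count += step if @count < LIMIT
    $request_count += 1
    self
  end

  def self.instances
    @@instances
  end
end

counter = Counter.new.bump.bump
```

Each sigil implies a different storage location and lookup rule, which is the multiplicity of scopes being described.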

JOSH:  Okay. But that dynamic allocation of those things as part of the instance is pretty similar?

ALEX:  Yeah, exactly.
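The Ruby half of that comparison is easy to see in a short sketch. The Point class here is hypothetical; the behavior shown (nil for an unset instance variable, creation on first assignment) is standard Ruby.

```ruby
class Point
  def x
    @x # reading an instance variable that was never set gives nil
  end

  def move_to(x)
    @x = x # the variable comes into existence on first assignment
    self
  end
end

point  = Point.new
before = point.instance_variables            # nothing is declared up front
unset  = point.x
after  = point.move_to(5).instance_variables # now the object carries @x
```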

JAMES:  Continuing along these lines but taking a different direction, when we were getting ready for this show, we looked over another slide deck of yours (this time I’m sure, Alex), called ‘Why are Python, Ruby, and JavaScript slow’. And it’s a really fascinating slide deck. I highly recommend people look at it. And basically, it amounts to you showing how in C, the typical pattern is, you allocate some buffer and then you call a bunch of methods passing that buffer along that just collects the data as you go. Whereas in Ruby or Python or JavaScript, all these dynamic languages where the language takes care of growing things for you and stuff like that, if you write the similar idiomatic code, you get tons of allocations under the hood and that’s why they end up being slow. And you do that as being in defense of things like dynamic typing and stuff and saying, “We can handle that stuff. That’s not the problem.”

Anyway, my question is, you said in that slide deck, you’d really like to see the APIs expanded to allow for this like low level kind of stuff where speed really matters. There’s no reason we couldn’t have an API where we tell how big of an allocation we’re going to need here. So, it doesn’t have to be dynamically done and stuff. And I’m wondering if you created Topaz and you’re pushing that as part of trying out this agenda to add the APIs to it and see what can be done.
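The two styles James contrasts can be put side by side in Ruby. This is only a sketch of the idea, not code from the slide deck; greeting_fresh and greeting_into are made-up helper names.

```ruby
# Style 1: each helper builds and returns a fresh string.
def greeting_fresh(name)
  "Hello, " + name + "!" # every + allocates a new String
end

# Style 2: the caller owns one buffer and helpers append to it in place.
def greeting_into(buffer, name)
  buffer << "Hello, " << name << "!" # << mutates the existing buffer
end

names = %w[Katrina James Josh]

# Idiomatic version: intermediate strings, an intermediate array, then a join.
joined = names.map { |name| greeting_fresh(name) }.join

# Buffer-passing version: one output string that grows as data is collected.
buffer = String.new
names.each { |name| greeting_into(buffer, name) }
```

Both versions produce the same text; the difference is in how many short-lived objects are created along the way.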

ALEX:  So, I would love it if Topaz could become part of sort of a movement, I guess, for lack of better term, towards having these APIs. But no, I want Topaz to be an implementation of Ruby where if you write some Ruby code and run it on Topaz, you can run it everywhere. It’s not my intent to embrace and extend Ruby.

JOSH:  [Chuckles] Nice! Nice terminology there.

[Laughter]

JAMES:  You know, Ruby does have some of that. Like as I was reading through, I was kind of thinking about what you were saying there like the array constructor in particular…

[Crosstalk]

ALEX:  Yeah, Array.new takes the size which is great. Python doesn’t really have that. So, it’s definitely…

JAMES:  It’s interesting, though. You’re right, like Array.new with a size is great in that you can pre-allocate the size you need, if you know it’s going to be some kind of fixed thing or something. And there’s no reason we couldn’t extend some other methods to work kind of along those lines and give ourselves an even richer API.
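For reference, these are the standard forms of the Array constructor being discussed. Note that Array.new(n) produces an array that already has n elements (nil by default), which is slightly different from reserving capacity for an empty array.

```ruby
slots   = Array.new(4)               # four elements, all nil
zeros   = Array.new(4, 0)            # four elements, all 0
squares = Array.new(4) { |i| i * i } # the block computes each element by index
```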

JOSH:  Well, we already have some methods that are mindful of allocation. Like the bang methods in the String class, like you have reverse!.

JAMES:  Right. And then the new lazy enumerators do not need the arrays in between. They just wire themselves to each other like a pipeline.

JOSH:  Yeah.

ALEX:  Yeah. Yehuda Katz was showing me the lazy stuff and I was really impressed. Those look like they could be a very powerful API for not giving up, frankly, the things that I like about Ruby and Python but still getting me this performance that I want.
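Both allocation-conscious features mentioned in this exchange fit in a short sketch: an in-place bang method, and a lazy pipeline (Enumerator::Lazy, available from Ruby 2.0) that never materializes the intermediate arrays. The variable names are illustrative.

```ruby
# Bang method: reverse! changes the receiver instead of returning a copy.
word = String.new("topaz")
id_before = word.object_id
word.reverse!
same_object = word.object_id == id_before

copy = word.reverse # the non-bang version allocates a new String

# Lazy enumeration: map and select are chained element by element, so an
# infinite range works and no intermediate array is built between the steps.
first_three = (1..Float::INFINITY).lazy
                                  .map { |n| n * 2 }
                                  .select { |n| n % 3 == 0 }
                                  .first(3)
```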

JOSH:  Yeah. Well, okay. The less said about lazy enumeration right now, the better.

[Laughter]

[Crosstalk]

ALEX:  I guess I’m not familiar enough with it.

JOSH:  Yeah, I have Ruby with it but let’s not get distracted with that now. So, I think that the presentation besides that was really interesting. I’m sorry I missed the talk at WAZA. But I think that you have to be really careful when you start getting into things like pre-allocating memory buffers because Ruby has automatic garbage collection. And you need to be careful that you’re not doing stuff that’s going to break the ability of the language to manage the memory for you.

ALEX:  Yeah. No, I totally agree. I don’t know how well it showed in the slides. But in the talk, one of the things I wanted to make very clear is I don’t want to give up how I write Python generally. I don’t want to give up all these things I like about the language. I don’t want to give up GC across the board. I like having the garbage collector. I just want the ability to give small hints that I know matter. Because right now, even as we have things like PyPy, like Topaz, like Rubinius, like JRuby trying to push the speed of these languages forward, a lot of the problem is you’re writing these languages and someone will tell you, “Oh, you’re still three times faster… or three times slower than Java.” And basically, I would like to be able to say, “We will write normal Python for 99% of our application, or normal Ruby or whatever.” And then, when your boss says, “Hey, this part of the thing needs to be really fast. It’s really important, it’s really CPU-intensive,” whatever it is, be able to say, “Okay, we will continue to write that in the same language. We will just be a little more conscious,” because I don’t see Python or Ruby as being inevitably slower.

I see it as, we’ve made some design choices along the way that certain APIs are so convenient and even if there’s a small performance cost, they’re worth it most of the time. But I’d like to be able to have alternatives for the small places I care about. And I’d like to have sort of a culture where people care about having these APIs available so that their libraries aren’t just okay except for if you need to be fast.

JAMES:  I don’t think that’s actually against the spirit of Ruby. Like we have things like the o modifier on a regular expression that says compile once. And I’m in the habit of tacking that on when I use something like interpolation for some static value. Like the other day, I had to strip a BOM off of some data. And so, I just did the escape sequence and put it in a string and then forced the encoding to the right encoding and interpolated that in my regex instead of trying to figure out what the right regex to build was. But then, I dropped an o at the end of the regex so Ruby wouldn’t compile it over and over again. So, we kind of have that in some areas.
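A sketch of the o modifier James describes: with /o, the interpolation in a regex literal is performed only the first time that literal is evaluated, and the compiled regex is reused afterwards. The helper exact_matcher and the sample data are invented for this example; the BOM here is written as the \uFEFF escape rather than with an explicit encoding step.

```ruby
def exact_matcher(word)
  /\A#{word}\z/o # o: interpolate and compile only on the first evaluation
end

first  = exact_matcher("topaz")
second = exact_matcher("rubinius") # argument ignored: the regex is already built

# Stripping a byte order mark with an interpolated, compile-once regex.
bom     = "\uFEFF"
data    = "#{bom}id,name"
cleaned = data.sub(/\A#{bom}/o, "")
```

The first half also shows the hazard: /o is only safe when the interpolated value really is static, as it is for the BOM.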

CHUCK:  I’m a little bit curious as to — two things. First off I’m going to ask, what are you working on with Topaz right now? Where is it going to be in the near future?

ALEX:  I would say, right now, one of the biggest things I do day to day is I review a lot of pull requests. I’ve been so amazed with the amount of help I’ve gotten since Topaz went open source with adding new methods, fixing bugs. Just across the board, I’ve been so pleased.

CHUCK:  Can I interrupt you real quick?

ALEX:  Sure.

CHUCK:  Are those mostly coming from Ruby developers or Python developers?

ALEX:  I think we’re getting some of both which is really exciting. It’s definitely awesome to see, “Hey, I’m a Python person. I haven’t done Ruby in two years. This is my best attempt at adding this feature.” And, “I’m a Ruby programmer. I’m not sure I understand the Python code perfectly but here’s my attempt.” We’ve definitely seen both of those and it’s pretty awesome.

CHUCK:  Yup. Anyway, back to where you’re going to be in the near future.

ALEX:  Yeah. Probably, the next big feature I want to start working on is FFI. Topaz will probably never support the Ruby C API. And so, we need a way to bind C libraries for things like database adapters and whatever else. So, I think FFI has good support from all the Ruby VMs right now. I know JRuby, Rubinius, CRuby all have it. So, going towards that is probably going to be the next big feature. Also obviously, just continuing to target more specs.

One of the things we’d really appreciate: if you try to run your Ruby code on Topaz and you get a NoMethodError or something, let us know what methods are missing for your code, because the more we see we’re missing something, the higher we’ll prioritize it.

CHUCK:  Right. So, it seems like there are some milestones for Ruby implementations. One of them is implementing a certain amount of RubySpec. It seems like another one that people talk about, depending on how involved they are with Rails, is whether or not it will run Rails. If I put a Rails app and try and run it on Topaz, is it going to work?

ALEX:  No. [Chuckles] Not by far.

JAMES:  That’s the endgame, right?

ALEX:  Yeah. Rails is like the holy grail. I think if you run Rails, you’ll run most things. No, sort of at the most basic level, there’s no way you’d be able to put anything on the Internet because we don’t have a socket module. So, that seems kind of a really important place to start on a more basic level. I’m sure we’re missing sort of so many methods along the way. I can’t even imagine what Rails would fail on first if you try to run it.

CHUCK:  My last question until I think of more, obviously, is, you’re now part of this Ruby Implementers Club. So, you and Charlie Nutter are like bros, or how does that all work?

ALEX:  Yeah. No, in building Topaz, the cooperation of Charlie and Evan Phoenix and Brian Ford from Rubinius has been fantastic. They all help me so much with the questions about the language, places that they felt were — places that were really conscious of performance, stuff like that.

A few months ago, Charlie published a blog post. I can’t remember what it’s titled. But it’s basically things in Ruby that you have to get right in order to say that you’re a legitimately fast implementation and it’s not just, “Oh, you’re missing things.” And that was basically from an email he sent to me about that very same question. I wanted to make sure I knew all the places in Ruby that we had to get right before it would be honest to do a benchmark. So yeah, working with them has been fantastic.

KATRINA:  So, how much do you want to implement before you declare a first stable version?

ALEX:  That’s a really good question. I’m a big believer that releases should just be nightly builds with the version number bumped. So, I’m really pleased we have a pretty good release infrastructure right now. Every time the tests run on Travis, we upload the build.

To actually put out a version number, it’s a really good question. I would love to say something like, “It runs RubyGems,” or something. Something like, “Hey, you could maybe, in theory, try to run something that kind of looks like a real program.” But right now, I don’t know the full scope of how big it is to get RubyGems running. So, maybe at least something smaller like as soon as we pass 10,000 RubySpecs or something.

KATRINA:  Alright.

CHUCK:  So, you mentioned running RubyGems. How much of your development is driven by, “We want to be able to run X,” versus maybe, “We need these features like FFI”?

ALEX:  So, for a very long time, the development of Topaz was basically spurred by, “Okay, what’s the next feature we need to run MSpec?” MSpec is the test runner for RubySpec. And it requires a decent amount of Ruby to get going. So yeah, that was definitely sort of where I started and I imagine we’ll be coming back towards that direction very soon with RubyGems or Rails or Sinatra or whatever it is. Finding a real Ruby program and just sort of finding one of the methods we’re missing.

JOSH:  Hey, where did you start? Like what was the first spec that you got to pass?

JAMES:  One plus two.

CHUCK:  [Laughs]

ALEX:  The first spec was probably like fixnum_even or something. But that first spec came six months, maybe more, after I started. The very first thing I implemented in Topaz was, I think, very basic parsing. For a long time, the Topaz parser was sort of a homegrown thing that parsed some subset of Ruby. I want to say within the first week, though, I had a real interpreter and had really started on the object model, really started on calling real methods. You’d have to go back in the git history. It’s all there to see how it evolved.

JOSH:  Right. And how are you doing the actual parser and compiler?

ALEX:  So, the parser is basically built on a library in Python called PLY, which is Python Lex-Yacc. Lex and yacc are pretty common Unix parsing tools, and this Python library ported them to Python. I created RPLY, which is a port of that to RPython. And basically, the grammar is a direct port of MRI’s and JRuby’s, which are both very similar.

JOSH:  I haven’t dug in to the code for it. But my understanding is that the parser for Ruby is kind of complex.

ALEX:  Yeah. The parser for Ruby is definitely the most complex I’ve ever worked with.

JOSH:  Yeah. I guess there are parts of the language that aren’t even context free.

ALEX:  Yeah. It’s pretty subtle. So, if you see F space left bracket right bracket, is that calling the method F with a hash or is that invoking F with an empty block?

JOSH:  [Chuckles] Yeah.

ALEX:  If you see A +B, is that invoking the method A on positive B or is that A+B?
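The two ambiguities Alex mentions can be sketched in plain Ruby. The methods `f` and `a` below are hypothetical and exist only to reveal how the parser reads each call.

```ruby
# Hypothetical method: reports whether it received a block or an argument.
def f(arg = :no_argument)
  block_given? ? :called_with_block : arg
end

block_call = f {}    # bare braces after a method name parse as an empty block
hash_call  = f({})   # parentheses force the braces to be read as a hash literal

# Hypothetical method named `a`; there is no local variable called `a`.
def a(n = 10)
  n
end

b = 1
binary = a + b   # spaces on both sides: binary plus, a() + b
unary  = a +b    # space before but not after: a(+b), a method call with +b

puts [block_call, hash_call, binary, unary].inspect
```

If `a` were a local variable rather than a method, both `a + b` and `a +b` would be read as addition, which is part of why the parser has to track which names are locals.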

JOSH:  Right. And then, there’s HEREDOCs.

ALEX:  Oh, yes. It’s great fun.

CHUCK:  [Chuckles]

JOSH:  Yeah.

JAMES:  Dave Thomas has some great talks he gave a couple of years back where he would just play with a snippet of Ruby. He would put it up and ask, “What do you think this turns out to be?” It just looked like letters and slashes but it was actually a regular expression that looked like modes. And then, he would just start making subtle changes like, “Move this over, add another token. Put in a space here.” It was almost schizophrenic what Ruby would do each time. I mean, like, “Oh! That’s a regex.” “Oh! That’s division.” “Oh! When I put a space in here, it’s a syntax error.” Everything is just great.

JOSH:  It sounds like Gary’s Wat talk.

JAMES:  Yeah, it’s a lot like that. Yeah.

ALEX:  Yeah. And this is compounded by — I have friends who, for them, the fun part of writing a compiler is figuring out the parser. For me, it’s not. For me, a parser is an annoying thing you have to write before you get to the fun part. [Chuckles]

JOSH:  It’s all about code generation.

ALEX:  Exactly.

KATRINA:  So, did you team up with these friends, get them to do [inaudible]?

ALEX:  Yeah. I worked with a number of friends who enjoy this field a whole lot more than I do. That’s why ultimately, we started with our own homegrown thing but really moved to just a port of JRuby’s and MRI’s.

KATRINA:  Right.

CHUCK:  So, what is the feature in Ruby that you have to implement in order to make Ruby work but that nobody really uses or knows about or uses wrong? That’s acceptable as well.

ALEX:  I think using things like break and next inside of a block. They’re probably not very frequently used and they’re incredibly subtle. In my experience, also something like rescue, sorry, not rescue, retry. I don’t think I’ve really seen retry used in real Ruby code, but there it is.
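A short sketch of why break and next inside a block are subtle: next ends only the current block call, while break ends the whole method that yielded. The method `each_pair_sum` is a hypothetical helper written for this illustration.

```ruby
numbers = [1, 2, 3, 4, 5]

# `next` returns from the block only; its value becomes the block's result
# for that element, and iteration continues.
doubled_odds = numbers.map do |n|
  next 0 if n.even?
  n * 2
end

# `break` abandons the iterating method itself and supplies its return value.
first_big = numbers.each do |n|
  break n if n > 3
end

# Hypothetical method: yields the sum of each adjacent pair, then returns
# :finished if the block never breaks out.
def each_pair_sum(list)
  list.each_cons(2) { |x, y| yield x + y }
  :finished
end

stopped = each_pair_sum(numbers) { |sum| break :stopped_early if sum > 5 }
completed = each_pair_sum(numbers) { |sum| sum }

puts [doubled_odds, first_big, stopped, completed].inspect
```

An implementation has to unwind through every method between the block and the call that received it, which is where much of the subtlety comes from.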

KATRINA:  I use retry.

ALEX:  Okay. Show us so I’d know.

JAMES:  I use it too.

CHUCK:  I don’t even know what it does.

JAMES:  It basically lets you go back to the top of a begin and start over. Like it’s like next, only you don’t move on to the next iteration. You’d retry the current iteration. Really the only place I have ever seen it used and the only place I have ever used it for sure is in a rescue block where maybe I’m making a remote call and I’m willing to try it three times or something like that, or try it with an exponential backoff, so do a sleep and then retry or something. That’s where it comes in mostly, in my opinion: error rescue.
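What James describes might look roughly like this. It is a sketch under assumptions: the helper `with_retries`, its keyword arguments, and the injectable `sleeper` are all invented for illustration, and the failing "remote call" is simulated.

```ruby
# Hypothetical helper: runs the block and, on failure, sleeps with an
# exponentially growing delay before retrying, up to max_attempts tries.
def with_retries(max_attempts: 3, base_delay: 0.5, sleeper: Kernel.method(:sleep))
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    raise if attempts >= max_attempts            # give up: re-raise the error
    sleeper.call(base_delay * (2**(attempts - 1)))
    retry                                        # jump back to the top of begin
  end
end

calls = 0
delays = []
# The sleeper is replaced with a lambda that records delays, so the example
# does not actually pause.
result = with_retries(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  calls += 1
  raise IOError, "service unavailable" if attempt < 3  # simulated failures
  :ok
end

puts [result, calls, delays].inspect
```

The attempt counter lives outside the begin block on purpose: retry re-runs everything inside begin, so a counter initialized there would reset on every pass.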

KATRINA:  That’s where I’ve used it as well. Though, I go to 25 not to three.

JAMES: [Laughs] Wow!

CHUCK: [Laughs] Why are you tweeting me 25 times?

JAMES:  I know. Now, I feel like all my code is not as good because I don’t hammer that service.

KATRINA:  It was desperation. Not anything smart on my part.

[Laughter]

JOSH:  Desperation is the mother of cursing. I don’t know.

[Laughter]

CHUCK:  Get in there! Get in there! Get in there! Get in there!

[Laughter]

JAMES:  Don’t give up! The project sounds awesome, Alex. It’s really cool.

KATRINA:  Definitely.

ALEX:  Yes.

JAMES:  Yeah. What a fun project.

JOSH:  I actually have one last brutal comparison question. And that’s in implementing Ruby, have you learned anything or have you discovered anything in Ruby that you really want to have in Python?

ALEX:  Yeah. I discovered this a little before I started the project from speaking with Gary Bernhardt. But I would — I absolutely love Blocks. It’s just that every time I try to sit down, write how would Blocks work in Python, I find things that make me sad. So, I’m deeply afraid I’ll never get Blocks in Python.

CHUCK:  [Laughs]

JOSH:  They’re really hard. [Chuckles] It’s crazy to make them work right. But they are like one of the most awesome parts of the language.

ALEX:  Yeah. They’re so useful in just crafting great APIs.
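As a small illustration of the kind of API blocks make possible, here is a sketch of a helper that wraps arbitrary multi-statement code. The method `timed` and its `report` argument are hypothetical names chosen for this example.

```ruby
# Hypothetical helper: times whatever block it is given, records the elapsed
# seconds under a label, and passes the block's own result back to the caller.
def timed(label, report)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  report << [label, elapsed]
  result
end

report = []
total = timed("sum", report) do
  # Any number of statements can go here. The block closes over surrounding
  # local variables, and its last expression becomes the value of `yield`.
  values = (1..1_000).to_a
  values.sum
end

puts total
```

The caller's code stays inline at the call site, with no need to name a separate function, which is a large part of what makes block-taking APIs pleasant to use.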

JAMES:  Agreed.

CHUCK:  Yeah. They definitely make your code cleaner, if not more — what’s the word? Usable, I guess.

JOSH:  [Laughs]

CHUCK:  They do a lot. I don’t know what the word is I’m looking for. Anyway…

ALEX:  Functional?

CHUCK:  Functional? Yeah.

JOSH:  That’s a pun. Okay, so…

[Laughter]

JOSH:  Okay. So, have we exhausted this topic?

JAMES:  Let’s do some picks.

CHUCK:  Okay. Let’s do picks. Katrina, what are your picks?

KATRINA:  Wait for it. Digital telepathy. Isn’t that awesome?

CHUCK:  Totally awesome.

[Crosstalk]

KATRINA:  They made electronic temporary tattoos where it picks up brain signals. And so, you can basically talk to someone who also has one of these without talking. And they’re working on making it so that you can like fly in the airplane and all of that cool stuff as well. And it has useful applications for like putting these on pre-term babies so that you can detect the onset of seizures. You know, real stuff as well. But digital telepathy!

CHUCK:  That’s awesome.

KATRINA:  That’s my only pick.

JOSH:  Well, that’s the only one you need, right?

[Laughter]

JOSH:  It’s awesome.

CHUCK:  James, you’re broadcasting again.

JAMES:  [Laughs]

JOSH:  Why are you thinking about pancakes?

[Crosstalk]

JAMES:  I don’t know why I think about pancakes.

CHUCK:  Yeah. I’m just thinking about all the times that my wife’s mad at me and I’m oblivious. I just — I don’t know if I want this!

[Laughter]

CHUCK:  Anyway. James, what are your picks?

JAMES:  Okay. I’ve got a few but I’m going to run through them really quick. I basically rediscovered iOS recently and I really got into it and played with a bunch of apps and stuff. And one of the things I wanted was a good text editor for just like little edits I have to do on my iPhone. And there’s an awesome comparison of text editors at [inaudible] modes that tells you like everything you can possibly imagine about them from where they store their files to how they handle markdown and syntax highlighting and stuff.

Two cool editors I found because of it are Textastic, which is pretty nice for editing a Ruby file on your phone, if you need to do that. And the other one I really like is Nebulous Notes, which is great if you want to do markdown on your phone. It also has an awesome macro system which means you can get pretty fast even with a touch screen keyboard, which is cool.

And in rediscovering iOS, of course, I’ve been finding a bunch of games. There are some great lists out there like 25 Best All Times Games or the 50 Best of 2012. And I used both of those. I found several great games off of them. But just the one I’ll throw out here is a lot of fun. It’s Outwitters. It’s a strategy game that it seems nobody in the world has discovered. But it’s a turn by turn strategy game, it’s a total blast. So, if you enjoy that kind of thing, you should check it out.

And then, I should probably just also say that the Skeptic’s Guide to the Universe last week talked about digital telepathy and they said, “No way!” That’s it.

KATRINA:  Dang it!

CHUCK:  [Laughs] Alright. Josh, what are your picks?

JOSH:  I’m going to start with the thing that made my life most awesome in the last week. And that’s there’s been a little discussion recently about Heroku and the router and queuing requests and performance hits on your application. So, I looked at my application which is running on Heroku and it’s still early days in my application. So, it’s no Rap Genius level of traffic. But I noticed that I was getting hit by the request queuing and I did the incredibly simple thing of just moving all of my static assets to S3, which made my app much more responsive. Yes, it’s kind of dumb serving static assets, software [inaudible] of the dyno. So, this is obvious.

But there’s this great gem called asset_sync that I used to do it and made it completely trivial. It took me very little effort to move all my assets to S3. And I recommend everybody who’s doing Heroku for their application to do this. So, that’s the asset_sync gem.

And then a fun pick is, a couple of months ago, I mentioned the Powers Comic which I was rereading all the graphic novels for it because the new comic was coming out. And yes, they’ve restarted the comic. So, I’m picking it. This is the second part of the storyline, I guess. And they’ve now ascended from being policemen to going up into the big leagues. So, it’s a — whatever that means, I won’t give it all away. But the title is Powers Bureau which apparently they couldn’t call it Powers FBI for legal reasons. So, there’s that.

And then, I have kind of a fun geeky pick which is strftimer.com. And I’ll just leave that there because it’s worth checking out. It’s really fun, and actually kind of useful. And that’s it for my picks this week.

CHUCK:  Alright. I got a couple of picks. The first one is, I used to use a program called Teleport on my Macs to allow me to move my mouse and use my keyboard across multiple machines. And it hasn’t worked since I upgraded to Mountain Lion. I don’t know why. I don’t really care anymore. I just gave up on it and moved over to Synergy which was something that I used a long time ago.

The thing I like about Synergy was that it works cross-platform which Teleport didn’t do. But the other thing is that it has a configuration in it now where you actually just tell it where you want each machine relative to the other machines and it just works. The way that it worked before is you have this config file. So, it’s like if I go all the way to the left on this machine, then I should wind up on this other machine. And then, you config things for when I move up from that machine, I should hit this other machine. And you know, it’s kind of a pain in the neck. Well now, you just click, click, click, start and you’re done. And so, I really, really like it. So, I’ll put a link to that in the show notes.

The other one that I found this last week is something from one of my clients. I picked up a customer from another freelancer who kind of changed directions and was referring out his leads or his customers. And so, I picked up this customer. And they were using this gem called astrails/safe to back up their databases. And I just thought that was way cool. So, I’m going to pick that as well. It just does a quick mysqldump or a dump of PostgreSQL. It does some other stuff too. But anyway, you can back up to local files, you can back up to Amazon, all kinds of stuff. And so, I was pretty impressed.

So, I’ll pick those and we will let Alex give us his picks.

ALEX:  Alright. So, first pick is a book called Fair Play by Steven Landsburg. It’s kind of an interesting economics and sort of life book that I enjoyed.

Next pick is a Python library called Werkzeug. I’m sure I’m mispronouncing that, it’s German. But it’s a Python web library. It’s got a really fascinating design. It’s not a web framework like anything I’ve ever seen before. It’s really a library. So, you write your application from end to end. And if you want, you instantiate its request object or you instantiate its response object. And I just think it’s a really fascinating design that everyone should check out.

And last pick is a textbook I read while I was in college called ‘Gender Through the Prism of Difference’ which I thought was really awesome. It’s got some really fantastic essays in there.

CHUCK:  Nice!

ALEX:  So, those are my picks.

CHUCK:  Awesome! Well, thanks for coming, Alex. It’s been fun talking and there are definitely some interesting things that I learned, at least, about building a VM and some of the stuff out there. So, thanks for coming.

ALEX:  Thank you so much for having me.

JAMES:  Also, you have scary hobbies.

CHUCK:  [Laughs]

JAMES:  Just saying.

ALEX:  They’re not so scary.

JOSH:  He’s building a VM for fun. What do you expect?

JAMES:  No, I think it’s really cool. I do. It’s really awesome.

ALEX:  [Laughs]

CHUCK:  They’re not so scary once you get to know them.

ALEX:  Exactly.
