Rust Browser Part 3: A Long Slog for Small Features

I had hoped to be talking more about how to build a browser, but reality has intervened. It’s taken about a week, but the family is starting to calm down now and get used to the new normal of staying home. I’ve stocked up on supplies and prepped for exclusively working from home. Jesse is recovered from pink-eye and a cold, and we’ve scrubbed the house clean. Now all we can do is wait and try to help others as best we can.

In the full code you can see I’ve implemented and reimplemented the inline layout flow several times. I think I’m finally close to getting it right. I’ll have to cover that in another post. Today I’ll talk about the process of adding a single tiny new feature to the browser. A process repeated over and over.

Browsers are hard

When I started this project I made a list of major features like block layout and parse stylesheet. That’s how it went for the first couple of weeks. I would hope that once those features were done it would seem like the code is finished. But of course it’s not. A feature like /parse stylesheet/ isn’t the sort of thing that can be covered by a single feature. It’s not parsing a stylesheet that’s the problem. It’s parsing all stylesheets. The specs for CSS and HTML are very deep, with tons of features and edge cases. Simply implementing the spec isn’t viable. It would be many man years and I’d give up before I finished.

So instead I don’t tackle big features anymore. Now, I find a small demo page and keep fixing bugs and adding features until that page renders properly (or at least as properly as I’m willing to stomach). Let me walk you through an example.

Today I wanted to style some list items with brown borders:

ul li { 
border: 1px solid brown;
}

I put this into a css file but it didn’t parse. It turns out this one thing represents a ton of missing features. ul li is an ancestor selector. I had to make this selector parse, then use it in the matching, then expand the values. Let’s walk through it.

Parsing Ancestor Selectors

First I created a unit test for the kind of selector I want to use:

#[test]
fn test_ancestor_selector() {
assert_eq!(selector().parse(b”a b”),
Ok(Selector::Ancestor(AncestorSelector{
ancestor:Box::new(Selector::Simple(SimpleSelector{
tag_name:Some(String::from(“a”)),
id: None,
Class: vec![],
})),
child:Box::new(Selector::Simple(SimpleSelector{
tag_name:Some(String::from(“b”)),
id: None,
Class: vec![],
})),
})));
}

This won’t compile yet because the model doesn’t support AncestorSelectors yet. Next I updated the Selector enum like this:

#[derive(Debug, PartialEq)]
pub enum Selector {
Simple(SimpleSelector),
Ancestor(AncestorSelector),
}

Then I added the AncestorSelector struct:

#[derive(Debug, PartialEq)]
pub struct AncestorSelector {
pub ancestor: Box<Selector>,
pub child: Box<Selector>,
}

Now that we have a place to put the values we can actually parse them. I added an ancestor selector parser rule like this:

fn ancestor<‘a>() -> Parser<‘a,u8,Selector> {
let r = alphanum_string() - space1() + alphanum_string();
r.map(|(a,b)| Selector::Ancestor(AncestorSelector{
ancestor: Box::new(a),
child: Box::new(b),
}))
}

Of course we have to call the rule to use it, so I updated the selector rule to use ancestor option like this:

fn selector<‘a>() -> Parser<‘a, u8, Selector>{
let r
= space()
+ (ancestor() | class_string() | star_string() | alphanum_string())
- space()
;
r.map(|(_, selector)| selector)
}

Now if we run the test it will pass. Yay we’re done!

Matching

Well, no. The selector is available but it’s not used yet. I still need to update the selector matching code to actually match styles which use ancestor selectors. Let’s switch from css/mod.rs to style/mod.rs , which is where all of the styles are actually matched and applied to elements. The good news is we already have a matches function that looks at the type of selector. We just need to add another arm to the match.

fn matches(elem: &ElementData, selector: &Selector, ancestors:&mut Vec::<(&Node,&PropertyMap)>) -> bool {
match *selector {
Simple(ref simple_selector) =>
matches_simple_selector(elem, simple_selector),
Ancestor(ref sel) => {
let child_match = matches(elem, &*sel.child, ancestors);
let mut parent_match = false;
if !ancestors.is_empty() {
let (parent_node,_) = &ancestors[0];
if let NodeType::Element(ed) = &parent_node.node_type {
parent_match = matches(ed, &*sel.ancestor, ancestors);
}
}
return child_match && parent_match;
}
}
}

Essentially we break the ancestor selector up into two parts: the child part and the parent part, then run each part through the same matches function again. The tricky part is that we use the current element for child and a parent element for the parent. And that means that we need a way to get the parent of each node. Unfortunately we don’t have that.

Graphs are hard in Rust

With the current DOM structure each node has a vector list of children, but it doesn’t have any reference to it’s parent. I originally tried to make a full two way tree structure like I would do in Javascript, but I couldn’t make the borrow checker accept it. It turns out these structures are very difficult to do in Rust, or at least to do them correctly and without unsafe code. One day I will tackle this problem again, but for today I found a simpler solution.

The matching process is a recursive tree traversal starting at the root node. Along the way it passes through each parent node, so we just need to pass the current ancestor to the matching function. For the time being I’m just implementing matching on the immediate parent, but since I will eventually want to support any ancestor up to the root I decided to pass in a vector of nodes along with their styles. That’s what the ancestors:&mut Vec::<(&Node,&PropertyMap)> vector in the code snippet above does. For the parent part it checks if there is at least one element in the ancestors vector (meaning we aren’t at the root) then passes that as the element for the matches function on the parent selector.

Finally, if both child and parent match, then the ancestor selector itself matches. Whew! That was a lot of work.

Okay, now we can put in a unit test for ancestor matching that actually works. It looks like this:

#[test]
fn test_ancestor_match() {
let doc_text = br#”
<b><a>rad</a></b>
“#;
let css_text = br#”
* {
color: black;
}
b a {
color:red;
}
“#;
let doc = load_doc_from_bytestring(doc_text);
let stylesheet = parse_stylesheet_from_bytestring(css_text).unwrap();
let snode = style_tree(&doc.root_node, &stylesheet);
println!(“doc is {:#?} {:#?} {:#?}”,doc,stylesheet,snode);

//check b
assert_eq!(snode.specified_values.get(“color”).unwrap(),
&Keyword(String::from(“black”)));

// check b a
assert_eq!(snode.children[0].specified_values.get(“color”).unwrap(),
&Keyword(String::from(“red”)));

}

Border Shorthand

So, that’s the first part of our rule, the selector. The second part is is the actual style rule: border: 1px solid brown;. If we put this in a CSS file now it will fail to parse. That’s because while I have code to parse an array of property values I have only implemented the cases for 1, 2, and 4. That’s because those are needed for margin: 1px, margin: 1px 2px, margin: 1px 2px 3px 4px, but I had forgotten to implement the 3 case. One more thing.

fn array_value_3<‘a>() -> Parser<‘a, u8, Value> {
let t = one_value() - space() + one_value() - space() + one_value();
t.map(|((v1,v2),v3)|{
Value::ArrayValue(vec![v1,v2,v3])
})
}

Then update the value rule to use array_value_3 as another option:

fn value<‘a>() -> Parser<‘a, u8, Value> {
call(array_value_4)
| call(array_value_3)
| call(array_value_2)
| call(list_array_value)
| one_value()
}

Then I updated the array values unit test like this: (only the last three tests are new).

#[test]
fn test_list_array_values() {
assert_eq!(array_value_2().parse(b"3px 4px"),
Ok(Value::ArrayValue(vec![Value::Length(3.0,Unit::Px),
Value::Length(4.0,Unit::Px)])));
assert_eq!(array_value_2().parse(b"3em 4.0rem"),
Ok(Value::ArrayValue(vec![Value::Length(3.0,Unit::Em),
Value::Length(4.0,Unit::Rem)])));
assert_eq!(array_value_2().parse(b"0.3em 0.4rem"),
Ok(Value::ArrayValue(vec![Value::Length(0.3,Unit::Em),
Value::Length(0.4,Unit::Rem)])));
assert_eq!(array_value_2().parse(b”.3em 0.4rem”),
Ok(Value::ArrayValue(vec![Value::Length(0.3,Unit::Em),
Value::Length(0.4,Unit::Rem)])));
assert_eq!(array_value_3().parse(b”1px solid black”),
Ok(Value::ArrayValue(vec![Value::Length(1.0,Unit::Px),
Value::Keyword(String::from(“solid”)),
Value::Keyword(String::from(“black”))])));
assert_eq!(array_value_3().parse(b”1px solid #cccccc”),
Ok(Value::ArrayValue(vec![Value::Length(1.0,Unit::Px),
Value::Keyword(String::from(“solid”)),
Value::HexColor(String::from(“#cccccc”))])));
assert_eq!(value().parse(b”1px solid #cccccc”),
Ok(Value::ArrayValue(vec![Value::Length(1.0,Unit::Px),
Value::Keyword(String::from(“solid”)),
Value::HexColor(String::from(“#cccccc”))])));
}

It’s verbose but auto-completion works wonders when writing Rust code, since it’s all statically typed. I suggest using the Rust plugins for IntelliJ or VS Code.

Once parsed I need to expand the border shorthand into it’s full version. So border: 1px solid brown becomes

  border-width-top: 1px;
border-width-right:1px;
border-width-bottom: 1px;
border-width-left: 1px;
border-color: brown;
border-style: solid;

I already have a function called expand_styles in style/mod.rs which expands CSS shorthands like margin and padding. I just need to add another case for border.

pub fn expand_styles(ss:&mut Stylesheet) {
for rule in ss.rules.iter_mut() {
match rule {
RuleType::Rule(rule) => {
let mut new_decs = vec![];
for dec in rule.declarations.iter_mut() {
// println!(“decl = {:#?}”,dec);
match dec.name.as_str() {
“margin” => expand_array_decl(&mut new_decs, dec),
“padding” => expand_array_decl(&mut new_decs, dec),
“border-width” => expand_array_decl(&mut new_decs, dec),
“border” => expand_border_shorthand(&mut new_decs, dec),
_ => new_decs.push(dec.clone()),
}
}
rule.declarations = new_decs;
}
_ => {}
}
}
}

And then expand_border_shorthand:

fn expand_border_shorthand(new_decs:&mut Vec::<Declaration>, dec:&Declaration) {
// println!(“expanding border shorthand: {:#?}”,dec);
match &dec.value {
Value::ArrayValue(vec) => {
if vec.len() != 3 {
panic!(“border shorthand must have three values”);
}
new_decs.push(Declaration{
name: String::from(“border-width-top”),
value: vec[0].clone()
});
new_decs.push(Declaration{
name: String::from(“border-width-left”),
value: vec[0].clone()
});
new_decs.push(Declaration{
name: String::from(“border-width-right”),
value: vec[0].clone()
});
new_decs.push(Declaration{
name: String::from(“border-width-bottom”),
value: vec[0].clone()
});
new_decs.push(Declaration{
name: String::from(“border-style”),
value: vec[1].clone()
});
new_decs.push(Declaration{
name: String::from(“border-color”),
value: vec[2].clone()
});
}
_ => {
panic!(“border shorthand must be an array value”);
}
}
}

Again it’s boring long code, but it works. In the future I’d like to try shortening it all with macros, but for now it’s good.

Now we can try the final unit test!

#[test]
fn test_border_shorthand() {
let doc_text = br#”<div></div>”#;
let css_text = br#”
div {
border: 1px solid black;
}
“#;

let doc = load_doc_from_bytestring(doc_text);
let mut stylesheet = parse_stylesheet_from_bytestring(css_text).unwrap();
expand_styles(&mut stylesheet);
let mut snode = style_tree(&doc.root_node, &stylesheet);
println!(“doc is {:#?} {:#?} {:#?}”,doc,stylesheet,snode);
assert_eq!(snode.lookup_length_px(“border-width-top”,5.0),1.0);
assert_eq!(snode.lookup_length_px(“border-width-right”,5.0),1.0);
assert_eq!(snode.lookup_length_px(“border-width-bottom”,5.0),1.0);
assert_eq!(snode.lookup_length_px(“border-width-left”,5.0),1.0);
assert_eq!(snode.lookup_keyword(“border-color”, &Keyword(String::from(“white”))), Keyword(String::from(“black”)));
}

It works. If I put this into my demo page everything parses and renders correctly. Victory is mine! Only a few billion more fixes like this to go. :)

Just 4.8 billion more to go.

The fix we just put in crossed the css, styling, and rendering modules. All just to implement ul li { border: 1px solid brown; } The good news is that everything had a reasonable place to go and once I got it all to compile it just worked. Also, once this was implemented a bunch of other previously broken styles I had ignored started to work now.

Building a browser is a whole bunch of little tiny bits, implemented one by one. You never really finish, of course. You asymptotically approach being finished, but never quite reach it. And that’s why everyone who isn't crazy like me just reuses an existing engine like Chromium.

Talk to me about it on Twitter

Posted March 21st, 2020

Tagged: rust browser