https://aturon.github.io/README.html
I suggest we use the AST, which is more precise than a regexp-based lexer.
I found that in the Rust Guidelines, the “Guidelines by Rust feature” section may be difficult to implement without an AST. But we can start with the style guides first.
Pitfalls of AST:
http://golang.org/src/cmd/gofmt/gofmt.go
https://aturon.github.io/README.html
Question 1: Analyze the following toy program:
// test/case1.rs
fn main() {
    let mut i = 3;
    i = 4;
    println!("Hello world {}!", i);
}
My program gives the following sample token stream output:
TokenAndSpan { tok: Comment, sp: Span { lo: BytePos(0), hi: BytePos(16), expn_id: ExpnId(4294967295) } }
TokenAndSpan { tok: Whitespace, sp: Span { lo: BytePos(16), hi: BytePos(17), expn_id: ExpnId(4294967295) } }
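I can see that two things are lost: the content of the Comment and the position of line breaks (a line break shows up only as a generic Whitespace token).
To recover them, I can slice the bytes between each span's lo and hi directly from the source file to get the Comment/Whitespace content, but I am not sure whether this is the best way.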
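The slicing idea above can be sketched as follows. This is only an illustration: `ByteSpan`, `snippet` and `count_newlines` are hypothetical names, and `ByteSpan` is a plain stand-in for the lexer's `Span` (assumed here to be a half-open byte range), not the compiler's own type.

```rust
/// Hypothetical stand-in for the lexer's span: a half-open byte range [lo, hi).
#[derive(Debug, Clone, Copy, PartialEq)]
struct ByteSpan {
    lo: usize,
    hi: usize,
}

/// Returns the source text covered by `span`, or `None` if the range is out of
/// bounds, reversed, or does not fall on UTF-8 character boundaries.
fn snippet(src: &str, span: ByteSpan) -> Option<&str> {
    src.get(span.lo..span.hi)
}

/// Counts the line breaks inside the text of a whitespace token.
fn count_newlines(text: &str) -> usize {
    text.bytes().filter(|&b| b == b'\n').count()
}
```

For the toy program, `snippet(src, ByteSpan { lo: 0, hi: 16 })` would give back the comment text, and running `count_newlines` on the slice for the span 16..17 would tell us that this Whitespace token is a line break.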
Question 2: Given the token stream, how should line breaks be inserted to keep every line within the 99-character limit?
One approach: look ahead to check whether breaking the line just before the current token solves the problem (taking into account the correct indentation of the new line).
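A minimal sketch of this look-ahead approach, under simplifying assumptions: tokens are plain strings, they are always separated by a single space, and every continuation line gets the same fixed indentation. The function name `break_greedy` and its parameters are made up for illustration.

```rust
/// Greedily lays out `tokens` separated by single spaces. Before placing a
/// token it looks ahead: if the token would push the current line past
/// `max_width`, the line is closed and a new one is started with `indent`
/// spaces. A single token wider than `max_width` is still emitted as-is.
fn break_greedy(tokens: &[&str], max_width: usize, indent: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for tok in tokens {
        let has_content = !current.trim().is_empty();
        let tok_width = tok.chars().count();
        let cur_width = current.chars().count();
        // Width of the line if this token is appended (plus one separator).
        let needed = cur_width + tok_width + if has_content { 1 } else { 0 };
        if has_content && needed > max_width {
            // Break before the token and indent the continuation line.
            lines.push(current);
            current = " ".repeat(indent);
        } else if has_content {
            current.push(' ');
        }
        current.push_str(tok);
    }
    if !current.trim().is_empty() {
        lines.push(current);
    }
    lines
}
```

With the tokens `let total = alpha + beta + gamma`, a limit of 20 and an indent of 4, this yields `let total = alpha +` followed by `    beta + gamma`. The break lands wherever the limit happens to be hit, with no regard for what kind of token sits there.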
Problem with this algorithm: if the line is only ~5 characters over the limit, the new line will be very short, which looks bad. Breaking near the middle would be a better solution.
Another problem: a large predicate should be broken at its boolean operators. There are several style options, such as breaking before or after the operator. But the algorithm above is unaware of boolean operators, so it will break the line at an arbitrary token.
So I think that for each overflowing line, some tokens should have higher priority as break points. For example, tokens at 60%~70% of the line are more likely to get a line break, and positions after boolean operators are more likely to get a line break.
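One way to make this priority idea concrete is to score every candidate break point and take the best one. The sketch below is only an assumption about how that could look: the weights (10 for a boolean operator, 5 for the 60%~70% band) are arbitrary, the band is measured against the width limit rather than the actual line length, and `best_break` is a hypothetical name.

```rust
/// Picks the index of the token after which an overflowing line should be
/// broken, or `None` if no break keeps the first part within `max_width`.
/// Tokens are assumed to be separated by single spaces.
fn best_break(tokens: &[&str], max_width: usize) -> Option<usize> {
    let mut best: Option<(usize, u32)> = None; // (token index, score)
    let mut col = 0usize; // column at the end of the current token
    for (i, tok) in tokens.iter().enumerate() {
        if i > 0 {
            col += 1; // separating space
        }
        col += tok.chars().count();
        // Stop once the limit is exceeded; breaking after the last token is useless.
        if col > max_width || i + 1 == tokens.len() {
            break;
        }
        let mut score = 1;
        if *tok == "&&" || *tok == "||" {
            score += 10; // prefer breaking right after a boolean operator
        }
        if col * 10 >= max_width * 6 && col * 10 <= max_width * 7 {
            score += 5; // prefer positions at 60%..70% of the limit
        }
        // On ties, the later candidate wins so the first line stays longer.
        if best.map_or(true, |(_, s)| score >= s) {
            best = Some((i, score));
        }
    }
    best.map(|(i, _)| i)
}
```

For `if a_is_ready && b_is_ready || c_is_ready {` with a limit of 30, this picks the position after `||` instead of an arbitrary token. This version only implements the break-after-operator style; breaking before the operator would need the bonus attached to the preceding token instead.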
// TODO: Read clang-format or GNU indent to see how they solve this problem.
Question 3: How to get the correct indentation for the current new line?
Target before March 17 (Proposal submission):
Explicitly write down the algorithm for the line break problem (and possibly the indentation problem).
Target before March 27 (Proposal deadline):
Implement the short line rule and make it work.