id,summary,reporter,owner,description,type,status,priority,milestone,component,version,resolution,keywords,cc,os,architecture,failure,difficulty,testcase,blockedby,blocking,related
4334,Make lines stricter to fix space leak,daniel.is.fischer,simonmar,"I propose changing the implementation of lines to be stricter.

The current implementation causes (in GHC) a space leak, so it is better to sacrifice a little laziness to fix that leak.

Period of discussion: Three weeks, until 15^th^ October (because of ICFP).

Current implementation:
{{{
lines                   :: String -> [String]
lines """"                =  []
lines s                 =  let (l, s') = break (== '\n') s
                           in  l : case s' of
                                        []      -> []
                                        (_:s'') -> lines s''
}}}
The relevant part of the core showing the leak (-O or -O2, similar for -O0 but without inlining `break`):
{{{
        let {
          ds1_sjM :: ([GHC.Types.Char], [GHC.Types.Char])
          LclId
          [Str: DmdType]
          ds1_sjM =
            case GHC.List.$wbreak @ GHC.Types.Char lvl1_rjW wild_B1
            of _ { (# ww1_aj3, ww2_aj4 #) ->
            (ww1_aj3, ww2_aj4)
            } } in
        GHC.Types.:
          @ GHC.Base.String
          (case ds1_sjM of _ { (l_aif, _) -> l_aif })
          (case ds1_sjM of _ { (_, s'_aih) ->
           case s'_aih of _ {
             [] -> GHC.Types.[] @ GHC.Base.String;
             : _ s''_adj -> Lines.lines s''_adj
           }
           })
}}}
`ds1_sjM` keeps a reference to the first line until the second is demanded, preventing it from being GCed.

Current strictness properties:
{{{
lines _|_       = _|_
lines (_|_:_|_) = _|_ : _|_
}}}
Proposed implementation:
{{{
lines                   :: String -> [String]
lines """"                =  []
lines s                 = case break (== '\n') s of
                            (l, s') -> l : case s' of
                                            []      -> []
                                            (_:s'') -> lines s''
}}}
Due to the pattern matching on `break`'s result, the pair is never considered a single entity, allowing the first line to be GCed before the second is demanded.

Strictness properties of the proposed implementation:
{{{
lines _|_       = _|_
lines (_|_:_|_) = _|_
}}}
Generally, a `_|_` as the first `Char` of a line would produce a `_|_` instead of the `_|_ : _|_` which the current implementation produces. `_|_` in any other place would be treated identically.
",proposal,closed,normal,Not GHC,libraries/base,6.12.3,fixed,"lines, space leak",,Unknown/Multiple,Unknown/Multiple,Other,,,,,
