NAME
List::Gen - provides functions for generating lists
VERSION
version 0.71
SYNOPSIS
this module provides higher order functions, generators, iterators, and other utility functions for working with lists. walk lists with any step size you want, create lazy ranges and arrays with a map like syntax that generate values on demand. there are several other hopefully useful functions, and all functions from List::Util are available.
use List::Gen;
print "@$_\n" for every 5 => 1 .. 15;
# 1 2 3 4 5
# 6 7 8 9 10
# 11 12 13 14 15
print mapn {"$_[0]: $_[1]\n"} 2 => %myhash;
for (@{range 0.345, -21.5, -0.5}) {
# loops over 0.345, -0.155, -0.655, -1.155 ... -21.155
}
my $fib; $fib = cache gen {$_ < 2 ? $_ : $$fib[$_ - 1] + $$fib[$_ - 2]};
my $fac; $fac = cache gen {$_ < 2 or $_ * $$fac[$_ - 1]};
say "@$fib[0 .. 15]"; # 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610
say "@$fac[0 .. 10]"; # 1 1 2 6 24 120 720 5040 40320 362880 3628800
EXPORT
use List::Gen; # is the same as
use List::Gen qw/mapn by every range gen cap filter test cache apply zip
min max reduce/;
the following functions are available:
mapn by every range gen cap filter test cache apply zip min max reduce
glob mapkey cartesian sequence d deref slide flip expand contract
collect makegen genzip overlay iterate gather curse
and from List::Util => first max maxstr min minstr reduce shuffle sum
use List::Gen '*'; # everything
use List::Gen ':all'; # same
use List::Gen ':base'; # same as 'use List::Gen;'
FUNCTIONS
mapn CODE NUM LIST-
this function works like the builtin
mapbut takesNUMsized steps over the list, rather than one element at a time. inside theCODEblock, the current slice is in@_and$_is set to$_[0]. slice elements are aliases to the original list. ifmapnis called in void context, theCODEblock will be executed in void context for efficiency.print mapn {$_ % 2 ? "@_" : " [@_] "} 3 => 1..20; # 1 2 3 [4 5 6] 7 8 9 [10 11 12] 13 14 15 [16 17 18] 19 20 print "student grades: \n"; mapn { print shift, ": ", (reduce {$a + $b} @_)/@_, "\n"; } 5 => qw { bob 90 80 65 85 alice 75 95 70 100 eve 80 90 80 75 }; by NUM LISTevery NUM LIST-
byandeveryare exactly the same, and allow you to add variable step size to any other list control structure with whichever reads better to you.for (every 2 => @_) {do something with pairs in @$_} grep {do something with triples in @$_} by 3 => @list;the functions generate an array of array references to
NUMsized slices ofLIST. the elements in each slice are aliases to the original list.in list context, returns a real array. in scalar context, returns a generator.
my @slices = every 2 => 1 .. 10; # real array my $slices = every 2 => 1 .. 10; # generator for (every 2 => 1 .. 10) { ... } # real array for (@{every 2 => 1 .. 10}) { ... } # generatorif you plan to use all the slices, the real array is better. if you only need a few, the generator won't need to compute all of the other slices.
print "@$_\n" for every 3 => 1..9; # 1 2 3 # 4 5 6 # 7 8 9 my @a = 1 .. 10; for (every 2 => @a) { @$_[0, 1] = @$_[1, 0] # flip each pair } print "@a"; # 2 1 4 3 6 5 8 7 10 9 print "@$_\n" for grep {$$_[0] % 2} by 3 => 1 .. 9; # 1 2 3 # 7 8 9 apply {CODE} LIST-
apply a function that modifies
$_to a shallow copy ofLISTand returns the copyprint join ", " => apply {s/$/ one/} "this", "and that"; > this one, and that one zip LIST_of_ARRAYREF-
interleaves the passed in lists to create a new list.
zipcontinues until the end of the longest list,undefis returned for missing elements of shorter lists.%hash = zip [qw/a b c/], [1..3]; # same as %hash = (a => 1, b => 2, c => 3); cap LIST-
capcaptures a list, it is exactly the same assub{\@_}->(LIST)note that this method of constructing an array ref from a list is roughly 40% faster than
[ LIST ], but with the caveat and feature that elements are aliases to the original list
generators
in this document, generators will refer to tied arrays that generate their elements on demand. generators can be used as iterators in perl's list control structures such as for , map or grep . since generators are lazy, infinite generators can be created. slow generators can also be cached.
- scalar context
-
all generator functions, in scalar context, will return a reference to a tied array. elements are created on demand as they are dereferenced.
my $range = range 0, 1_000_000, 0.2; # will produce 0.0, 0.2, 0.4, ... 1000000.0 say map sprintf('% -5s', $_)=> @$range[10 .. 15]; # calculates 5 values >> 2 2.2 2.4 2.6 2.8 3 my $gen = gen {$_**2} $range; # attaches a generator function to a range say map sprintf('% -5s', $_)=> @$gen[10 .. 15]; >> 4 4.84 5.76 6.76 7.84 9the returned reference also has the following methods:
$gen->next # iterates over generator ~~ $gen->get($gen->index++) $gen->() # same. iterators return undef when past the end $gen->more # test if $gen->index not past end $gen->reset # reset iterator to start $gen->reset(4) # $gen->next returns $$gen[4] $gen->index # fetches the current position $gen->index = 4 # same as $gen->reset(4) $gen->get(index) # returns $$gen[index] $gen->(index) # same $gen->slice(4 .. 12) # returns @$gen[4 .. 12] $gen->(4 .. 12) # same $gen->size # returns scalar @$gen $gen->all # same as @$gen but faster $gen->purge # purge any caches in the source chain $gen->span # collects $gen->next calls until one # returns undef, then returns the collection. # ->span starts from and moves the ->indexthe methods duplicate/extend the tied functionality and are necessary when working with indices outside of perl's limit
(0 .. 2**31 - 1)or when fetching a list return value (perl clamps the return to a scalar with the array syntax). in most cases, they are also a little faster than the tied interface.gen, filter, test, cache, flip, reverse (alias of flip), expand, and collect are also methods of generators.
my $gen = (range 0, 1_000_000)->gen(sub{$_**2})->filter(sub{$_ % 2}); #same as: filter {$_ % 2} gen {$_**2} 0, 1_000_000; - list context
-
generator functions, in list context, can return the actual tied array. this functionality, since potentially confusing, is disabled by default.
set
$List::Gen::LIST = 1;to enable list context returns. when false, both scalar and list context return the reference.it only makes sense to use this syntax directly in list control structures such as a
forloop, in other situations all of the elements will be generated during the initial assignment from the function, which in some cases may be useful, but mostly would be a bad thing (especially for large ranges). the real tied array also does not have the above accessor methods, and can not be passed to another generator function.due to memory allocation issues with infinite generators, this feature will be removed by version 1.0 or sooner. its function (without the allocation problem) can always be achieved by wrapping a generator with
@{...}
range START STOP [STEP]-
returns a generator for values from
STARTtoSTOPbySTEP, inclusive.STEPdefaults to 1 but can be fractional and negative. depending on your choice ofSTEP, the last value returned may not always beSTOP.range(0, 3, 0.4) will return (0, 0.4, 0.8, 1.2, 1.6, 2, 2.4, 2.8) print "$_ " for @{range 0, 1, 0.1}; # 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 print "$_ " for @{range 5, 0, -1}; # 5 4 3 2 1 0 my $nums = range 0, 1_000_000, 2; print "@$nums[10, 100, 1000]"; # gets the tenth, hundredth, and thousandth numbers in the range # without calculating any other values gen CODE GENERATORgen CODE ARRAYREFgen CODE [START STOP [STEP]]-
genis the equivalent ofmapfor generators. it returns a generator that will apply theCODEblock to its source when accessed.gentakes a generator, array ref, or suitable arguments forrangeas its source. with no arguments,genuses the range0 .. infinity.my @result = map {slow($_)} @source; # slow() called @source times my $result = gen {slow($_)} \@source; # slow() not called my ($x, $y) = @$result[4, 7]; # slow() called twice my $lazy = gen {slow($_)} range 1, 1_000_000_000; same: gen {slow($_)} 1, 1_000_000_000; print $$lazy[1_000_000]; # slow() only called oncegen {...} cap LISTis a replacement for[ map {...} LIST ]and is faster thangen {...} [ LIST ].note that while effort has gone into making generators as fast as possible there is overhead involved with lazy generation. simply replacing all calls to
mapwithgenwill almost certainly slow down your code. use these functions in situations where the time / memory required to completely generate the list is unacceptable. glob-
if you export the
globfunction from this package, that function and the<*.glob>operator will have one special case overridden. if given an argument that matches the following pattern:/^ ( .+ : )? number .. number ( (by | += | -= | [-+,]) number )? ( (if | when) .+ )? $/then the arguments will be passed to
range,gen, andfilteras appropriate. any argument that doesn't match that pattern will be passed to perl's builtinglobfunction. here are a few examples:<1 .. 10> ~~ range 1, 10 <1 .. 10 by 2> ~~ range 1, 10, 2 <10 .. 1 -= 2> ~~ range 10, 1, -2 <x * x: 1 .. 10> ~~ gen {$_ * $_} 1, 10 <sin: 0 .. 3.14 += 0.01> ~~ gen {sin} 0, 3.14, 0.01 <1 .. 10 if x % 2> ~~ filter {$_ % 2} 1, 10 <sin: 0 .. 100 by 3 if /5/> ~~ filter {/5/} gen {sin} 0, 100, 3 for (@{< 0 .. 1_000_000_000 by 2 >}) { # starts instantly print "$_\n"; last if $_ >= 100; # exits the loop with only 51 values generated } my @files = <*.txt>; # works as normal iterate CODE [START STOP [STEP]]-
iteratereturns a generator that is created iteratively.iterateimplicitly caches its values, this allows random access normally not possible with an iterative algorithmmy $fib = do { my ($an, $bn) = (0, 1); iterate { my $return = $an; ($an, $bn) = ($bn, $an + $bn); $return } }; gather CODE [START STOP [STEP]]-
gatherreturns a generator that is created iteratively. rather than returning a value, you calltake($return_value)within theCODEblock. note that since perl does not have continuations,take(...)does not pause execution of the block. rather, it stores the return value, the block finishes, and then the generator returns the stored value.you can not import the
take(...)function from this module.take(...)will be installed automatically into your namespace during the execution of theCODEblock. because of this, you must always calltake(...)with parenthesis.takereturns its argument unchanged.gather implicitly caches its values, this allows random access normally not possible with an iterative algorithm. the algorithm in
iterateis a bit cleaner here, butgatheris slower thaniterate, so benchmark if speed is a concernmy $fib = do { my ($x, $y) = (0, 1); gather { ($x, $y) = ($y, take($x) + $y) } }; makegen ARRAY-
makegenconverts an array to a generator. this is normally not needed as most generator functions will call it automatically if passed an array reference sequence LIST-
string generators together. the
->applymethod is called on each argument filter CODE GENERATORfilter CODE ARRAYREFfilter CODE [START STOP [STEP]]-
filteris a lazy version ofgrepwhich attaches a code block to a generator or range. it returns a generator that will test elements with the code block on demand. with no arguments,filteruses the range0 .. infinitynormal generators, such as those produced by
rangeorgen, have a fixed length, and that is used to allow random access within the range. however, there is no way to know how many elements will pass a filter. because of this, random access within the filter is not alwaysO(1).filterwill attempt to be as lazy as possible, but to access the 10th element of a filter, the first 9 passing elements must be found first. depending on the coderef and the source, the filter may need to process significantly more elements from its source than just 10.in addition, since filters don't know their true size, entire filter arrays do not expand to the correct number of elements in list context. to correct this, call the
->applymethod which will test the filter on all of its source elements. after that, the filter will return a properly sized array. calling->applyon an infinite (or very large) range wouldn't be a good idea. if you are using->applyfrequently, you should probably just be usinggrep. you can call->applyon any stack of generator functions, it will start from the deepest filter and move up.the method
->allwill first call->applyon itself and then return the complete listfilters implicitly cache their elements. accessing any element below the highest element already accessed is
O(1).accessing individual elements or slices works as you would expect.
my $filter = filter {$_ % 2} 0, 100; say $#$filter; # incorrectly reports 100 say "@$filter[5 .. 10]"; # reads the source range up to element 23 # prints 11 13 15 17 19 21 say $#$filter; # reports 88, closer but still wrong $filter->apply; # reads remaining elements from the source say $#$filter; # 49 as it should benote:
filternow reads one element past the last element accessed, this allows filters to behave properly when dereferenced in a foreach loop (without having to call->apply). if you prefer the old behavior, set$List::Gen::FILTER_LOOKAHEAD = 0 test CODE GENERATORtest CODE ARRAYREFtest CODE [START STOP [STEP]]-
testattaches a code block to a generator or range. accessing an element of the returned generator will call the code block first with the element in$_, and if it returns true, the element is returned, otherwise an empty list (undef in scalar context) is returned.when accessing a slice of a tested generator, if you use the
->(x .. y)syntax, the the empty lists will collapse and you may receive a shorter slice. an array dereference slice will always be the size you ask for, and will have undef in each failed slot cache CODEcache GENERATORcache list => ...-
cachewill return a cached version of the generators returned by functions in this package. when passed a code reference, cache returns a memoized code ref (arguments joined with$;). when in 'list' mode, the source is in list context, otherwise scalar context is used.my $gen = cache gen {slow($_)} \@source; # calls = 0 print $gen->[123]; # calls += 1 ... print @$gen[123, 456] # calls += 1 flip GENERATOR-
flipisreversefor generators. the->applymethod is called onGENERATOR.flip gen {$_**2} 0, 10 ~~ gen {$_**2} 10, 0, -1 expand GENERATORexpand SCALE GENERATOR-
expandscales a generator with elements that return equal sized lists. it can be passed a list length, or will automatically determine it from the length of the list returned by the first element of the generator.expandimplicitly caches its returned generator.my $multigen = gen {$_, $_/2, $_/4} 1, 10; # each element returns a list say join ' '=> $$multigen[0]; # 0.25 # only last element say join ' '=> &$multigen(0); # 1 0.5 0.25 # works say scalar @$multigen; # 10 say $multigen->size; # 10 my $expanded = expand $multigen; say join ' '=> @$expanded[0 .. 2]; # 1 0.5 0.25 say join ' '=> &$expanded(0 .. 2); # 1 0.5 0.25 say scalar @$expanded; # 30 say $expanded->size; # 30 my $expanded = expand gen {$_, $_/2, $_/4} 1, 10; # in one line contract SCALE GENERATOR-
contractis the inverse ofexpandalso called
collect genzip LIST-
genzipis a lazy version ofzip. it takes any combination of generators and array refs and returns a generator. overlay GENERATOR PAIRS-
overlay allows you to replace the values of specific generator cells. to set the values, either pass the overlay constructor a list of pairs in the form
index => value, ..., or assign values to the returned generator using normal array ref syntaxmy $fib; $fib = overlay gen {$$fib[$_ - 1] + $$fib[$_ - 2]}; @$fib[0, 1] = (0, 1); # or my $fib; $fib = gen {$$fib[$_ - 1] + $$fib[$_ - 2]} ->overlay( 0 => 0, 1 => 1 ); print "@$fib[0 .. 15]"; # '0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610' recursive GENERATOR-
my $fib = gen {self($_ - 1) + self($_ - 2)} ->overlay( 0 => 0, 1 => 1 ) ->cache ->recursive; print "@$fib[0 .. 15]"; # '0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610' dd SCALARderefderef SCALAR-
dereference a
SCALAR,ARRAY, orHASHreference. any other value is returned unchangedprint join " " => map deref, 1, [2, 3, 4], \5, {6 => 7}, 8, 9, 10; # prints 1 2 3 4 5 6 7 8 9 10 mapkey CODE KEY LIST-
this function is syntactic sugar for the following idiom
my @cartesian_product = map { my $first = $_; map { my $second = $_ map { $first . $second . $_ } 1 .. 3 } qw/x y z/ } qw/a b c/; my @cartesian_product = mapkey { mapkey { mapkey { $_{first} . $_{second} . $_{third} } third => 1 .. 3 } second => qw/x y z/ } first => qw/a b c/; cartesian CODE LIST_of_ARRAYREF-
cartesiancomputes the cartesian product of any number of array refs, each which can be any size. returns a generatormy $product = cartesian {$_[0] . $_[1]} [qw/a b/], [1, 2]; @$product == qw( a1 a2 b1 b2 ); slide {CODE} WINDOW LIST-
slides a
WINDOWsized slice overLIST, callingCODEfor each slice and collecting the resultas the window reaches the end, the passed in slice will shrink
print slide {"@_\n"} 2 => 1 .. 4 # 1 2 # 2 3 # 3 4 # 4 # only one element here curse HASHREF PACKAGE-
many of the functions in this package utilize closure objects to avoid the speed penalty of dereferencing fields in their object during each access.
curseis similar toblessfor these objects and while a blessing makes a reference into a member of an existing package, a curse conjures a new package to do the reference's biddingpackage Closure::Object; sub new { my ($class, $name, $value) = @_; curse { get => sub {$value}, set => sub {$value = $_[1]}, name => sub {$name}, } => $class }Closure::Objectis functionally equivalent to the following normal perl object, but with faster method calls since there are no hash lookups or other dereferences (around 40-50% faster for short getter/setter type methods)package Normal::Object; sub new { my ($class, $name, $value) = @_; bless { name => $name, value => $value, } => $class } sub get {$_[0]{value}} sub set {$_[0]{value} = $_[1]} sub name {$_[0]{name}}the trade off is in creation time / memory, since any good curse requires drawing at least a few pentagrams in the blood of an innocent package.
the returned object is blessed into the conjured package, which inherits from the provided
PACKAGE. always use$obj->isa(...)rather thanref $obj eq ...due to this. the conjured package name matches/${PACKAGE}::_\d+/special keys:
-bless => $reference # returned instead of HASHREF -overload => [fallback => 1, '""' => sub {...}]when fast just isn't fast enough, since most cursed methods don't need to be passed their object, the fastest way to call the method is:
my $obj = Closure::Object->new('tim', 3); my $set = $obj->{set}; # fetch the closure # or $obj->can('set') $set->(undef, $_) for 1 .. 1_000_000; # call without first argwhich is around 70% faster than pre-caching a method from a normal object for short getter/setter methods.
AUTHOR
Eric Strom, <ejstrom at gmail.com>
BUGS
version 0.70 comes with a bunch of new features, if anything is broken, please let me know. see filter for a minor behavior change
versions 0.50 and 0.60 break some of the syntax from previous versions, for the better.
report any bugs / feature requests to bug-list-gen at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=List-Gen.
comments / feedback / patches are also welcome.
COPYRIGHT & LICENSE
copyright 2009 Eric Strom.
this program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
see http://dev.perl.org/licenses/ for more information.