PDL::FAQ - Frequently asked questions about PDL
Current FAQ version: 1.008
This is version 1.008 of the PDL FAQ, a collection of frequently asked questions
about PDL - the Perl Data Language.
You can find the latest version of this document at
<http://pdl.perl.org/?docs=FAQ&title=Frequently%20Asked%20Questions>.
This is a considerably reworked version of the PDL FAQ. As such many errors
might have crept in and many updates might not have made it in. You are
explicitly encouraged to let us know about questions which you think should be
answered in this document but currently aren't.
Similarly, if you think parts of this document are unclear, please tell the FAQ
maintainer about it. Where a specific answer is taken in full from someone's
posting, the authorship should be indicated; let the FAQ maintainer know if it
isn't. For more general information, explicit acknowledgment is not made in the
text; instead there is an incomplete list of contributors at the end of
this document. Please contact the FAQ maintainer if you feel hard done by.
Send your comments, additions, suggestions or corrections to the PDL mailing
list at [email protected]. See Q: 3.2 below for instructions on how to join the
mailing lists.
PDL stands for Perl Data Language. In the words of Karl Glazebrook, initiator
of the PDL project:
The PDL concept is to give standard perl5 the ability
to COMPACTLY store and SPEEDILY manipulate the large
N-dimensional data sets which are the bread and butter
of scientific computing. e.g. $x=$y+$z can add two
2048x2048 images in only a fraction of a second.
It provides tons of useful functionality for scientific and numeric analysis.
For readers familiar with other scientific data evaluation packages it may be
helpful to add that PDL is in many respects similar to IDL, MATLAB and similar
packages. However, it tries to improve on a number of issues which were
perceived (by the authors of PDL) as shortcomings of those existing packages.
PDL is supported by its users. General informal support for PDL is provided
through the PDL mailing list ([email protected], see below).
As a Perl extension (see Q: 2.5 below) it is devoted to the idea of free and
open development put forth by the Perl community. PDL was and is being
actively developed by a loosely knit group of people around the world who
coordinate their activities through the PDL development mailing list
([email protected], see Q: 3.2 below). If you would like to join in the ongoing
efforts to improve PDL, please join this list.
There are actually several reasons and everyone should decide for themselves
which are the most important ones:
- PDL is "free software". The authors of PDL think that this concept has
  several advantages: everyone has access to the sources, which means better
  debugging, easy adaptation to your own needs, extensibility for your
  purposes, etc. In comparison with commercial packages such as MATLAB and IDL
  this is of considerable importance for workers who want to do some work at
  home and cannot afford the considerable cost of buying commercial packages
  for personal use.
- PDL is based on a powerful and well designed scripting language: Perl. In
  contrast to other scientific/numeric data analysis languages it has been
  designed using the features of a proven language instead of having grown
  into existence from scratch. Defining the control structures while features
  are added during development leads to languages that often appear clumsy
  and badly planned, as is the case for most existing packages with a scope
  similar to PDL's.
- Using Perl as the basis, a PDL programmer has all the powerful features of
  Perl at hand, right from the start. This includes regular expressions,
  associative arrays (hashes), well designed interfaces to the operating
  system, network, etc. Experience has shown that even in mainly numerically
  oriented programming it is often extremely handy to have easy access to
  powerful semi-numerical or completely non-numerical functionality as well.
  For example, you might want to offer the results of a complicated
  computation as a server process to other processes on the network, perhaps
  directly accepting input from other processes on the network. Using Perl
  and existing Perl extension packages, things like this are no problem at
  all (and it all will fit into your "PDL script").
- Extremely easy extensibility and interoperability, as PDL is a Perl
  extension; development support for Perl extensions is an integral part of
  Perl and there are already numerous extensions to standard Perl freely
  available on the network.
- Integral language features of Perl (regular expressions, hashes, object
  modules) immensely facilitated development and implementation of key
  concepts of PDL. One of the most striking examples of this is probably
  PDL::PP (see Q: 6.16 below), a code generator/parser/pre-processor that
  generates PDL functions from concise descriptions.
- None of the existing data languages follow the Perl language rules, which
  the authors firmly believe in:
    - TIMTOWTDI: There is more than one way to do it. Minimalist languages
      are interesting for computer scientists, but for users, a little bit of
      redundancy makes things wildly easier to cope with and allows
      individual programming styles - just as people speak in different ways.
      For many people this will undoubtedly be a reason to avoid PDL ;)
    - Simple things are simple, complicated things possible: Things that are
      often done should be easy to do in the language, whereas seldom done
      things shouldn't be too cumbersome.
  All existing languages violate at least one of these rules.
- As a project for the future, PDL should be able to use supercomputer
  features, e.g. vector capabilities/parallel processing, GPGPU acceleration.
  This will probably be achieved by having PDL::PP (see Q: 6.16 below)
  generate appropriate code on such architectures to exploit these features.
- [ fill in your personal 111 favourite reasons here...]
Just in case you do not yet know what the main features of PDL are and what one
could do with them, here is a (necessarily selective) list of key features:
PDL is well suited for matrix computations, general handling of multidimensional
data, image processing, general scientific computation, numerical
applications. It supports I/O for many popular image and data formats, 1D
(line plots), 2D (images) and 3D (volume visualization, surface plots via
OpenGL - for instance implemented using Mesa or video card OpenGL drivers),
graphics display capabilities and implements many numerical and semi-numerical
algorithms.
Through the powerful pre-processor it is also easy to interface Perl to your
favorite C routines, more of that further below.
PDL is a Perl5 extension package. As such it needs an existing Perl5
installation (see below) to run. Furthermore, much of PDL is written in Perl
(+ some core functionality that is written in C). PDL programs are
(syntactically) just Perl scripts that happen to use some of the functionality
implemented by the package "PDL".
Since PDL is just a Perl5 package you need first of all an installation of Perl5
on your machine. As of this writing PDL requires version 5.10.x of perl, or
higher. More information on where and how to get a Perl installation can be
found at the Perl home page <http://www.perl.org> and at many CPAN sites
(if you do not know what CPAN is, check the answer to the next question).
To build PDL you also need a working C compiler, support for XSUBs, and the
package ExtUtils::MakeMaker. If you don't have a compiler there might be a
binary distribution available, see "Binary distributions" below.
If you can (or cannot) get PDL working on a new (previously unsupported)
platform we would like to hear about it. Please report your success/failure
to the PDL mailing list at [email protected]. We will do our best to assist you
in porting PDL to a new system.
PDL is available as a source distribution in the Comprehensive Perl Archive
Network (or CPAN) and from the GitHub project page at
<https://github.com/PDLPorters/pdl>. The CPAN archive contains not only
the PDL distribution but also just about everything else that is Perl-related.
CPAN is mirrored by dozens of sites all over the world. The main site is
<http://www.cpan.org>, and local CPAN sites (mirrors) can be found
there. PDL's homepage is at <http://pdl.perl.org> and the latest version
can also be downloaded from there.
We are delighted to be able to give you the nicest possible answer to a question
like this: PDL is *free software* and all sources are publicly available. But
still, there are some copyrights to comply with. So please try to be as nice
as we (the PDL authors) are and comply with them.
Oh, before you think it is *completely* free: you have to invest some time to
pull the distribution from the net, compile and install it and (maybe) read
the manuals.
The complete PDL documentation is available with the PDL distribution. Use the
command "perldoc PDL" to start learning about PDL.
The easiest way by far, however, to get familiar with PDL is to use the PDL
on-line help facility from within the PDL shell, "pdl2". Just type
"pdl2" at your system prompt. Once you are inside the "pdl2" shell, type
"help". Using the "help" and "apropos" commands inside the shell you should
be able to find your way round the documentation.
Even better, you can immediately try your newly acquired knowledge about PDL by
issuing PDL/Perl commands directly at the command line. To illustrate this
process, here is the record of a typical "pdl2" session of a PDL
beginner (lengthy output is only symbolically reproduced in angle brackets,
< ... >):
    unix> pdl2
    pdl> help
    < ... help output ... >
    pdl> help PDL::QuickStart
    < ... perldoc page ... >
    pdl> $x = pdl (1,5,7.3,1.0)
    pdl> $y = sequence float, 4, 4
    pdl> help inner
    < ... help on the 'inner' function ... >
    pdl> $c = inner $x, $y
    pdl> p $c
    [22.6 79.8 137 194.2]
For further sources of information that are accessible through the Internet see
next question.
First of all, for all purely Perl-related questions there are tons of sources on
the net. Good points to start are <http://www.perl.com> and
<http://www.perl.org>.
The PDL home site can be accessed by pointing your web browser to
<http://pdl.perl.org>. It has tons of goodies for anyone interested in PDL:
- PDL distributions
- On-line documentation
- Pointers to an HTML archive of the PDL mailing lists
- A list of platforms on which PDL has been successfully tested.
- News about recently added features, ported libraries, etc.
- Names of the current pumpkin holders for the different PDL modules (if you
  want to know what that means you had better have a look at the web pages).
If you are interested in PDL in general you can join the pdl-general mailing
list. This is a forum to discuss programming issues in PDL, report bugs, seek
assistance with PDL related problems, etc.
If you are interested in all the technical details of the ongoing PDL
development you can join the pdl-devel mailing list.
Subscription and current archive links to both mailing lists can be found at
<http://pdl.perl.org/?page=mailing-lists>.
Cross-posting between these lists should be avoided unless there is a
very good reason for doing that.
The PDL project, begun in the late 1990s, has undergone considerable evolution
since that time, and the support for it has as well. Thus mailing-list
archives are in several places. Originally pdl-general was called 'perldl',
and pdl-devel was called 'pdl-porters'.
|--------------------------------------------------------------------|
|Time Period | URL                                                   |
|------------|-------------------------------------------------------|
|1996 - 2004 | http://www.xray.mpe.mpg.de/mailing-lists/perldl/      |
|1997 - 2004 | http://www.xray.mpe.mpg.de/mailing-lists/pdl-porters/ |
|2005 - 2015 | http://perldl.jach.hawaii.narkive.com/                |
|2005 - 2015 | http://pdl-porters.jach.hawaii.narkive.com/           |
|2015 -      | https://sourceforge.net/p/pdl/mailman/pdl-general/    |
|2015 -      | https://sourceforge.net/p/pdl/mailman/pdl-devel/      |
|--------------------------------------------------------------------|
As of this writing (FAQ version 1.008 of 21 May 2017) the latest stable version
is 2.018. The latest stable version should always be available from a CPAN
mirror site near you (see Question 2.7 for info on where to get PDL).
The most current (possibly unstable) version of PDL can be obtained from the Git
repository (see Question 4.10). Periodic CPAN developer releases of the Git
code will be made for testing purposes and more general availability.
Over its development, PDL has used two styles of version number. It used a
single floating point version number from versions 1.x through 2.005, at which
point it switched to a dotted triple version for 2.1.1 onward, EXCEPT for
version 2.2, which should have been released as 2.2.0. To simplify and unify
things, PDL reverted to a single float version representation with PDL-2.006.
This can cause dependency problems for modules that set a minimum PDL version
of 2.2. To work around it, note that all extant PDL releases have version
numbers greater than 2.2.1, so using 2.2.1 as the minimum version will work.
Two ways that you could help almost immediately are (1) participating in CPAN
Testers for PDL and related modules, and (2) proofreading and clarifying the
PDL documentation so that it is as usable as possible for PDL users,
especially new users.
To participate in CPAN Testers and contribute test reports, see the page
<http://wiki.cpantesters.org/wiki/QuickStart>, which has instructions for
getting started for both "CPAN" and "CPANPLUS" users.
If you have a certain project in mind you should check if somebody else is
already working on it or if you could benefit from existing modules. Do so by
posting your planned project to the PDL developers mailing list at
[email protected]. See the subscription instructions in Question 3.2. We are
always looking for people to write code and/or documentation ;).
First, make sure that the bug/problem you came across has not already been dealt
with somewhere else in this FAQ. Secondly, you can check the searchable
archive of the PDL mailing lists to find whether this bug has already been
discussed. If you still haven't found any explanations you can post a bug
report to [email protected], or through the Bugs link on
<http://pdl.perl.org>. See the BUGS file in the PDL distribution
for what information to include. If you are unsure, discussions via the perldl
mailing list can be most helpful.
First make sure you have read the file INSTALL in the distribution. This
contains a list of common problems which are unnecessary to repeat here.
Next, check the file perldl.conf to see if by editing the configuration
options in that file you will be able to successfully build PDL. Some of the
modules need additional software installed; please refer to the file
DEPENDENCIES for further details. Make sure to edit the location of
these packages in perldl.conf if you have them in non-standard locations.
N.B. Unix shell specific: If you would like to save an edited perldl.conf for
future builds just copy it as ~/.perldl.conf into your home directory,
where it will be picked up automatically during the PDL build process.
Also, check for another, pre-existing version of PDL on the build system.
Multiple PDL installs in the same PATH or @INC can cause puzzling test or
build failures.
If you still can't make it work properly please submit a bug report including
detailed information on the problems you encountered to the perldl mailing
list ([email protected], see also above). Response is often rapid.
Most users should not have to edit any configuration files manually. However, in
some cases you might have to supply some information about awkwardly placed
include files/libraries or you might want to explicitly disable building some
of the optional PDL modules. Check the files INSTALL and perldl.conf for
details.
If you had to manually edit perldl.conf and are happy with the results
you can keep the file handy for future reference. Place it in
~/.perldl.conf where it will be picked up automatically or use
"perl Makefile.PL PDLCONF=your_file_name" next time you build PDL.
For the basic PDL functionality you don't need any additional software. However,
some of the optional PDL modules included in the distribution (notably most
graphics and some I/O modules) require certain other libraries/programs to be
installed. Check the file DEPENDENCIES in the distribution for details
and directions on how to get these.
To install PDL in a non-standard location, use the INSTALL_BASE option in the
"perl Makefile.PL" configure step. For example,
"perl Makefile.PL INSTALL_BASE=/mydir/perl5" will configure PDL to install into
the tree rooted at "/mydir/perl5". For more details see
"How do I keep my own module/library directory?" in perlfaq8 and subsequent
sections. Another alternative is to use local::lib to do the heavy lifting for
the needed configuration.
To guarantee a completely clean installation of PDL, you will need to first
delete the current installation files and folders. These will be all
directories named "PDL" in the Perl @INC path, files named
"*Pdlpp*" in any "Inline" directories, and the programs
"pdl", "pdldoc", "pdl2", "perldl", and "pptemplate". Then just build and
install as usual. This is much easier to keep track of if you always install
"PDL" into a non-standard location. See Q: 4.4 above.
Information about binary distributions of PDL can be found on
<http://pdl.perl.org>. At present there are binary distributions of PDL
for Linux (RedHat and Debian), FreeBSD, Mac OS X and Windows, though they
might not be the most recent version.
If someone is interested in providing binary distributions for other
architectures, that would be very welcome. Let us know on the
[email protected] mailing list. Also check your Linux
distribution's package manager as many now include PDL. PPMs for win32
versions (both 32bit and 64bit) are also available.
Yes, PDL does run on Linux and indeed much of the development has been done
under Linux. On <http://pdl.perl.org> you can find links to packages for
some of the major distributions. Also check your distribution's package
manager (yum, apt, urpmi, ...) as PDL is now available through many of these.
PDL builds fine on Win32 using MinGW or Microsoft compilers. See the
win32/INSTALL file in the PDL source distribution for details. Other
compilers have not been tested; input is welcome. There is also a distribution
of PDL through ActiveState's ppm, though it might not always be the latest
version. PDL-2.018 builds out of the box on Strawberry Perl and ActiveState
Perl and there are distributions of Strawberry Perl with bundled PDL (see
<http://strawberryperl.com/releases.html>).
No. PDL development was conducted with a CVS repository from December 1999 to
April 2009. In April 2009 the project switched to the Git version control
system (see <http://git-scm.com>).
Assume you have Git installed on your system and want to download the project
source code into the directory "PDL". To get read-only access to the
repository, you type at the command line
    git clone git://github.com/PDLPorters/pdl PDL
If you wish to submit changes to PDL, you should "fork" the repository
from <https://github.com/PDLPorters/pdl>, then clone your fork in the
normal fashion.
To become an official PDL developer, you will need to be added to the GitHub
"PDLPorters" organisation.
For official PDL developers, to get read/write access to the repository type at
the command line
    git clone git@github.com:PDLPorters/pdl
They can still use their own fork; at least one active developer uses that model
rather than branches on the main repository.
The best way is to check <https://github.com/PDLPorters/pdl/pulls> to see
if somebody has submitted a pull request related to your problem.
In addition, if you are not subscribed to the mailing lists, check the archives
of the "pdl-devel" and "pdl-general" mailing lists. See
Question 3.2 for details.
The first thing you should do is to read the Git documentation and learn the
basics about Git. There are many sources available online. It is very
important that you use Git "best practice", with branches, but
fortunately this is very easy! Here are the basics.
Make sure your copy is up to date with the main repo:
    git checkout master
    git pull --rebase # rebase in case you wrongly changed your own master
Make a branch:
    git checkout -b mybranch-name
Commit your changes locally:
    git add <file1> <file2> ...
    git commit
or combine these two with:
    git commit -a
Test PDL before you push it to the main repository. If the code is broken
for you, then it is most likely broken for others. Luckily, the rest of this
process will test that automatically to help you catch such errors.
Then update the shared repository with your changes:
    git push -u origin mybranch-name
This will still leave your changes on a branch, but this is good. Now go to the
GitHub page, <https://github.com/PDLPorters/pdl>. It will ask you
whether you want to make a "pull request" - you do. Follow the
prompts. This will then initiate the automated "continuous
integration" tests, on Linux and Windows, with various versions of Perl
and various compilers. You will also want to get at least one other developer
to review your changes.
Once this review process is successfully completed, you can merge your changes
to the master branch!
Until 2.075, "threading" was used to refer to two ideas, but that
ambiguity has now been resolved by using the now (as of 2022)
industry-standard term "broadcasting" for the vectorisation /
array-programming concept.
- When mentioned in the INSTALL directions and possibly during the build
  process, we have the usual computer science meaning of multi-threading in
  mind (useful mainly on multiprocessor machines or clusters), currently (as
  of 2.074) POSIX threads (see PDL::ParallelCPU).
- PDL broadcasting of operations on ndarrays (as mentioned in the indexing
  docs) is the iteration of a basic operation over appropriate sub-slices of
  ndarrays, e.g. the inner product "inner $x, $y" of a (3) pdl $x and a
  (3,5,4) pdl $y results in a (5,4) ndarray where each value is the result of
  an inner product of the (3) pdl with a (3) sub-slice of the (3,5,4)
  ndarray. For details check PDL::Indexing.
The connection is that broadcasting divides up independent operations that can
be done in parallel.
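As a minimal sketch of the broadcasting case described above (the variable names and the use of "sequence" to fill $y are illustrative assumptions, not taken from the PDL docs), the following script checks the resulting dimensions and one element of the result:

```perl
use strict;
use warnings;
use PDL;

my $x = pdl(1, 2, 3);         # dims (3)
my $y = sequence(3, 5, 4);    # dims (3,5,4), filled with 0..59

# inner() consumes the first dimension of each argument; the remaining
# (5,4) dimensions of $y are broadcast over.
my $c = inner($x, $y);
print join(',', $c->dims), "\n";

# Element (0,0) of the result is the inner product of $x with the
# first (3) sub-slice of $y.
my $first = inner($x, $y->slice(':,(0),(0)'));
print "element (0,0) matches\n" if $c->at(0, 0) == $first->sclr;
```

Each of the 20 output values is computed independently of the others, which is why broadcast operations lend themselves to parallel execution.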
Well, PDL scalar variables (which are instances of a particular class of Perl
objects, i.e. blessed thingies (see "perldoc perlobj")) are in
common PDL parlance often called ndarrays (for example, check the
mailing list archives). Err, clear? If not, simply use the term ndarray
when you refer to a PDL variable (an instance of a PDL object as you might
remember) regardless of what actual data the PDL variable contains.
Sometimes "perldl" ("pdl2") is used as a synonym for PDL.
Strictly speaking, however, the name "perldl" ("pdl2") is
reserved for the little shell that comes with the PDL distribution and is
supposed to be used for the interactive prototyping of PDL scripts. For
details check perldl or pdl2.
Just type "help" (shortcut = "?") at the "pdl2"
shell prompt and proceed from there. Another useful command is the
"apropos" (shortcut = "??") command. Also try the
"demo" command in the "perldl" or "pdl2" shell
if you are new to PDL.
See answer to the next question why the normal Perl array syntax doesn't work
for ndarrays.
OK, you are right in a way. The docs say that ndarrays can be thought of as
arrays. More specifically, they say (PDL::QuickStart):
I find when using the Perl Data Language it is most useful
to think of standard Perl @x variables as "lists" of generic
"things" and PDL variables like $x as "arrays" which can be
contained in lists or hashes.
So, while ndarrays can be thought of as some kind of multi-dimensional array
they are not arrays in the Perl sense. Rather, from the point of view
of Perl they are some special class (which is currently implemented as an
opaque pointer to some stuff in memory) and therefore need special functions
(or 'methods' if you are using the OO version) to access individual elements
or a range of elements. The functions/methods to check are "at" /
"set" (see the section 'Sections' in PDL::QuickStart) or the
powerful "slice" function and friends (see PDL::Slices and
PDL::Indexing and especially PDL::NiceSlice).
Finally, to confuse you completely, you can have Perl arrays of ndarrays, e.g.
$spec[3] can refer to a pdl representing, e.g., a spectrum, where $spec[3] is
the fourth element of the Perl list (or array ;) @spec. This may be confusing
but is very useful!
Most people will try to form new ndarrays from old ndarrays using some variation
on the theme: "$x = pdl([$y, 0, 2])". This does work, but may not
work in the way that a novice user would expect. (If $y has N dimensions then
$x will have N+1 dimensions.) Other ways to concatenate ndarrays are to use
the functions "cat", "append", and "glue".
Similarly you can split ndarrays using the command "dog".
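A short sketch of how these functions differ may help; the two input ndarrays and the variable names are made up for illustration, and the method-call form of "glue" is assumed here:

```perl
use strict;
use warnings;
use PDL;

my $y = pdl(1, 2, 3);
my $z = pdl(4, 5, 6);

my $joined  = append($y, $z);     # joins along dim 0: dims (6)
my $stacked = cat($y, $z);        # adds a new last dimension: dims (3,2)
my $glued   = $y->glue(0, $z);    # joins along the given dim (here 0): dims (6)

# dog() is the inverse of cat(): split along the last dimension
my ($first, $second) = dog($stacked);

print "append: ", join(',', $joined->dims),  "\n";
print "cat:    ", join(',', $stacked->dims), "\n";
print "glue:   ", join(',', $glued->dims),   "\n";
print "dog:    $first $second\n";
```

"append" and "glue" keep the number of dimensions and lengthen one of them, whereas "cat" keeps every input intact as a slice of a higher-dimensional ndarray.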
This question is related to the "inplace" function. From the
documentation (see PDL::QuickStart):
Most functions, e.g. log(), return a result which is a
transformation of their argument. This makes for good
programming practice. However many operations can be done
"in-place" and this may be required when large arrays are in
use and memory is at a premium. For these circumstances the
operator inplace() is provided which prevents the extra copy
and allows the argument to be modified. e.g.:
$x = log($array); # $array unaffected
log( inplace($bigarray) ); # $bigarray changed in situ
And also from the documentation:
Obviously when used with some functions which can not be
applied in situ (e.g. convolve()) unexpected effects may
occur!
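A small sketch of the difference, using "sqrt" as an example of a function that supports in-place operation (the data values are arbitrary):

```perl
use strict;
use warnings;
use PDL;

my $x = pdl(1, 4, 9);

my $roots = sqrt($x);     # returns a new ndarray; $x is unaffected
print "after sqrt(\$x):       x = $x, roots = $roots\n";

$x->inplace->sqrt;        # result is written back into $x, no extra copy
print "after inplace->sqrt:  x = $x\n";
```

The "inplace" flag applies only to the next operation on that ndarray and is cleared afterwards.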
See next question on assignment in PDL.
This is caused by the fact that currently the assignment operator "="
allows only restricted overloading. For some purposes of PDL it turned out to
be necessary to have more control over the overloading of an assignment
operator. Therefore, PDL uses the operator ".=" for certain types
of assignments.
In Perl 5.6.1 and higher this assignment can be made using lvalue subroutines:
    pdl> $x = sequence(5); p $x
    [0 1 2 3 4]
    pdl> $x->slice('1:2') .= pdl([5,6])
    pdl> p $x
    [0 5 6 3 4]
See PDL::Lvalue for more info. PDL also supports a more matrix-like slice syntax
via the PDL::NiceSlice module:
    pdl> $x(1:2) .= pdl([5,6])
    pdl> p $x
    [0 5 6 3 4]
With versions of Perl prior to 5.6.1, or when running under the perl
debugger, this has to be done using a temporary variable:
    pdl> $x = sequence(5); p $x
    [0 1 2 3 4]
    pdl> $tmp = $x->slice('1:2'); p $tmp;
    [1 2]
    pdl> $tmp .= pdl([5, 6]); # Note .= !!
    pdl> p $x
    [0 5 6 3 4]
This can also be made into one expression, which is often seen in PDL code:
    pdl> ($tmp = $x->slice('1:2')) .= pdl([5,6])
    pdl> p $x
    [0 5 6 3 4]
Yes you can, but not in the way you probably tried first. It is not possible to
use an ndarray directly in a conditional expression since this is usually
poorly defined. Instead PDL has two very useful functions: "any" and
"all". Use these to test whether any or all elements in an ndarray
fulfill some criterion:
    pdl> $x=pdl ( 1, -2, 3);
    pdl> print '$x has at least one element < 0' if (any $x < 0);
    $x has at least one element < 0
    pdl> print '$x is not positive definite' unless (all $x > 0);
    $x is not positive definite
It is a common problem that you try to make a mask array or something similar
using a construct such as
    $mask = which($ndarray > 1 && $ndarray < 2); # incorrect
This does not work! What you are looking for is the bitwise
logical operators '|' and '&', which work on an element-by-element basis.
So it is really very simple: Do not use logical operators on multi-element
ndarrays since that really doesn't make sense; instead write the example as:
    $mask = which($ndarray > 1 & $ndarray < 2);
which works correctly.
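As a runnable sketch of the corrected form (the sample data and the extra parentheses around each comparison are additions for illustration; the parentheses only make the precedence explicit):

```perl
use strict;
use warnings;
use PDL;

my $ndarray = pdl(0.5, 1.5, 1.7, 2.5);

# Element-wise AND of two comparison results gives a 0/1 mask
my $mask    = ($ndarray > 1) & ($ndarray < 2);
# which() converts the mask into the indices of its non-zero elements
my $indices = which($mask);
# index() picks out the matching values
my $values  = $ndarray->index($indices);

print "mask:    $mask\n";
print "indices: $indices\n";
print "values:  $values\n";
```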
"null" is a special token for 'empty ndarray'. A null pdl can be used
to flag to a PDL function that it should create an appropriately sized and
typed ndarray. Null ndarrays can be used in places where a PDL function
expects an output or temporary argument. Output and temporary arguments are
flagged in the signature of a PDL function with the "[o]" and "[t]"
qualifiers (see next question if you don't know what the signature of a PDL
function is).
For example, you can invoke the "sumover" function as follows:
    sumover $x, $y=null;
which is equivalent to
    $y = sumover $x;
If this still seems a bit murky, check PDL::Indexing and PDL::PP for details
about calling conventions, the signature and broadcasting (see also below).
The signature of a function is an important concept in PDL. Many (but not
all) PDL functions have a signature which specifies the arguments and
their (minimal) dimensionality. As an example, look at the signature of the
"maximum" function:
    'a(n); [o] b;'
This says that "maximum" takes two arguments, the first of which is
(at least) one-dimensional while the second one is zero-dimensional and an
output argument (flagged by the "[o]" qualifier). If the
function is called with ndarrays of higher dimension the function will be
repeatedly called with slices of these ndarrays of appropriate dimension (this
is called broadcasting in PDL).
For details and further explanations consult PDL::Indexing and PDL::PP.
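To make the signature concrete, here is a small sketch that calls "maximum" on a two-dimensional ndarray, once with the usual return value and once with an explicit output argument created from a null ndarray. The input data and variable names are illustrative assumptions:

```perl
use strict;
use warnings;
use PDL;

my $m = sequence(3, 2);    # dims (3,2): rows [0 1 2] and [3 4 5]

# 'a(n)' consumes the first dimension (n = 3); the remaining dimension
# (size 2) is broadcast over, so the result has dims (2).
my $max = $m->maximum;
print "maximum per row: $max\n";

# The same call, passing the '[o]' output argument explicitly.
my $out = null;
maximum($m, $out);
print "via null output: $out\n";
```

Both calls yield one maximum for each (3) slice of the input.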
The short answer is: read PDL::Objects (e.g. type "help PDL::Objects"
in the perldl or pdl2 shell).
The longer answer (extracted from PDL::Objects): Since a PDL object is an
opaque reference to a C struct, it is not possible to extend the PDL class by
e.g. extra data via sub-classing (as you could do with a hash based Perl
object). To circumvent this problem PDL has built-in support to extend the PDL
class via the has-a relation for blessed hashes. You can get the HAS-A to
behave like IS-A simply by assigning the PDL object to the attribute named
"PDL" and redefining the method initialize(). For example:
    package FOO;
    @FOO::ISA = qw(PDL);
    sub initialize {
        my $class = shift;
        my $self = {
            creation_time => time(), # necessary extension :-)
            PDL => PDL->null, # used to store PDL object
        };
        bless $self, $class;
    }
For another example check the script t/subclass.t in the PDL distribution.
Dataflow is an experimental project that you don't need to concern yourself with
(it should not interfere with your usual programming). However, if you want to
know, have a look at PDL::Dataflow . There are applications which will benefit
from this feature (and it is already at work behind the scenes).
Simple answer: PDL::PP is both a glue between external libraries and PDL and a
concise language for writing PDL functions.
Slightly longer answer: PDL::PP is used to compile very concise definitions into
XSUB routines implemented in C that can easily be called from PDL and which
automatically support broadcasting, dataflow and other things without you
having to worry about it.
For further details check PDL::PP and the section below on Extensions of PDL.
ndarrays behave like Perl references in many respects. So when you say
    $x = pdl [0,1,2,3];
    $y = $x;
then both $y and $x point to the same object. Saying
    $y++;
will *not* create a copy of the original ndarray but just increment in place, of
which you can convince yourself by saying
    print $x;
    [1 2 3 4]
This should not be mistaken for dataflow, which connects several *different*
objects so that data changes are propagated between the so linked ndarrays
(though, under certain circumstances, dataflowed ndarrays can share physically
the same data).
It is important to keep the "reference nature" of ndarrays in mind when passing
ndarrays into subroutines. If you modify the input ndarrays you modify the
original argument, not a copy of it. This is different from some other array
processing languages but makes for very efficient passing of ndarrays between
subroutines. If you do not want to modify the original argument but rather a
copy of it, just create a copy explicitly (this example also demonstrates how
to properly check for an explicit request to process inplace, assuming your
routine can work inplace):
sub myfunc {
my $pdl = shift;
if ($pdl->is_inplace) {
$pdl->set_inplace(0)
} else {
# modify a copy by default
$pdl = $pdl->copy
}
$pdl->set(0,0);
return $pdl;
}
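A short usage sketch may make the two code paths clearer. The routine is repeated so that the snippet is self-contained; the variable names $data and $result are only illustrative.

```perl
use strict;
use warnings;
use PDL;

# Same routine as above: works on a copy unless in-place is requested.
sub myfunc {
    my $pdl = shift;
    if ($pdl->is_inplace) {
        $pdl->set_inplace(0);   # clear the flag, then modify the argument
    } else {
        $pdl = $pdl->copy;      # modify a copy by default
    }
    $pdl->set(0, 0);            # set element 0 to the value 0
    return $pdl;
}

my $data = pdl(5, 6, 7);

my $result = myfunc($data);     # default: $data is left alone
print "$data $result\n";        # [5 6 7] [0 6 7]

myfunc($data->inplace);         # explicit request to work in place
print "$data\n";                # [0 6 7]
```

Calling "inplace" on the argument sets the flag that "is_inplace" tests, which is why the second call changes $data itself.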
The current versions of PDL already support quite a number of different I/O
formats. However, it is not always obvious which module implements which
formats. To help you find the right module for the format you require, here is
a short list of the currently supported I/O formats, with a hint as to which
module implements each:
- •
- A home brew fast raw (binary) I/O format for PDL is
implemented by the FastRaw module
- •
- The FlexRaw module implements generic methods for the input
and output of `raw' data arrays. In particular, it is designed to read
output from FORTRAN 77 UNFORMATTED files and the low-level C
"write" function, even if the files are compressed or gzipped.
It is possible that the FastRaw functionality will be included in the
FlexRaw module at some time in the future.
- •
- FITS I/O is implemented by the
"wfits"/"rfits" functions in PDL::IO::FITS .
- •
- ASCII file I/O in various formats can be achieved by using
the "rcols" and "rgrep" functions in PDL::IO::Misc.
- •
- PDL::IO::Pic implements an interface to the NetPBM/PBM+
filters to read/write several popular image formats; also supported is
output of image sequences as MPEG movies, animated GIFs and a wide variety
of other video formats.
- •
- On CPAN you can find the PDL::NetCDF module that works with
PDL 2.007.
For further details consult the more detailed list in the PDL::IO documentation
or the documentation for the individual modules.
Assuming all arrays are of the same size and in some format recognized by
"rpic" (see PDL::IO::Pic ) you could say:
use PDL::IO::Pic;
@names = qw/name1.tif .... nameN.tif/; # some file names
$dummy = PDL->rpic($names[0]);
$cube = PDL->zeroes($dummy->type,$dummy->dims,$#names+1); # make 3D ndarray
for (0..$#names) {
# this is the slice assignment
($tmp = $cube->slice(":,:,($_)")) .= PDL->rpic($names[$_]);
}
or
$cube(:,:,($_)) .= PDL->rpic($names[$_]);
for the slice assignment using the new PDL::NiceSlice syntax and Lvalue
assignments.
The for loop reads the actual images into a temporary 2D ndarray whose values
are then assigned (using the overloaded ".=" operator) to the
appropriate slices of the 3D ndarray $cube .
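The same slice assignment can be tried without any image files. The sketch below is an assumption-laden stand-in: the three small ndarrays built with "sequence" take the place of the results of "rpic", and the names @images and $plane are made up for the illustration.

```perl
use strict;
use warnings;
use PDL;

# Stand-ins for images read from disk: three 4x3 ndarrays with distinct values.
my @images = map { sequence(4, 3) + 100 * $_ } 0 .. 2;

# 3D ndarray with the type and 2D shape of the first image.
my $cube = zeroes($images[0]->type, $images[0]->dims, scalar @images);

for my $i (0 .. $#images) {
    # slice() returns a view into $cube, so ".=" writes into the cube itself
    (my $plane = $cube->slice(":,:,($i)")) .= $images[$i];
}

print join('x', $cube->dims), "\n";   # 4x3x3
print $cube->at(1, 2, 2), "\n";       # 209
```

The parentheses in "($i)" drop the third dimension from the slice, so each plane has the same 2D shape as the image assigned to it.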
This answer applies mainly to PDL::Graphics::TriD (PDL's device independent 3D
graphics model) which is the trickiest one in this respect. You find some test
scripts in Demos/TriD in the distribution. There are also 3dtest.pl and
line3d.pl in the PDL/Example/TriD directory. After you have built PDL
you can do:
perl -Mblib Example/TriD/3dtest.pl
perl -Mblib Example/TriD/line3d.pl
to try the two TriD test programs. They only exercise one TriD function each but
their simplicity makes it easy to debug if needed with the Perl debugger, see
perldebug.
The programs in the Demo directory can be run most easily from the
"perldl" or "pdl2" interactive shell:
perl -Mblib perldl or perl -Mblib Perldl2/pdl2
followed by "demo 3d" or "demo 3d2" at the prompt.
"demo" by itself will give you a list of the available PDL demos.
You can run the test scripts in the Demos/TriD directory manually by changing to
that directory and running
perl -Mblib <testfile>
where "testfile" should match the pattern "test[3-9].p",
and watch the results. Some of the tests should bring up a window where you
can control (twiddle) the 3D objects with the mouse. Try using mouse button 1
for turning the objects in 3D space, mouse button 3 to zoom in and out, and
'q' to advance to the next stage of the test.
Questions like this should be a thing of the past with the PDL on-line help
system in place. Just try (after installation):
un*x> pdl2
pdl> apropos trid
Check the output for promising hits and then try to look up some of them, e.g.
pdl> help PDL::Graphics::TriD
Note that case matters with "help" but not with "apropos" .
There are a few sources of trouble with PGPLOT and PNG files. First, when
compiling the pgplot libraries, make sure you uncomment the PNG entries in the
drivers.list file. Then when running 'make' you probably got an error
like
make: *** No rule to make target `png.h', needed by `pndriv.o'. Stop.
To fix this, find the line in the 'makefile' that starts with 'pndriv.o:' (it's
near the bottom). Change, for example, ./png.h to /usr/include/png.h, if that
is where your header files are (you do have the libpng and libz devel
packages, don't you?). Do this for all four entries on that line, then go back
and run "make".
Second, if you already have the PGPLOT Perl module and PDL installed, you
probably tried to write out a PNG file and got a fatal error message like:
undefined symbol: png_create_write_struct
This is because the PGPLOT Perl module does not automatically link against the
png and z libraries. So when you are installing the PGPLOT Perl module
(version 2.19) from CPAN, don't do "install PGPLOT", but just do
"get PGPLOT". Then exit from CPAN and manually install PGPLOT,
running Makefile.PL as follows:
perl Makefile.PL EXLIB=png,z EXDIR=/usr/lib
assuming that there exist files such as /usr/lib/libpng.so.* and
/usr/lib/libz.so.*. Then do the standard "make; make test; make install"
sequence. Now you can write PNG files from PDL!
The first stop is again "perldl" or "pdl2" and the on-line
help or the PDL documentation. There is already a lot of functionality in PDL
which you might not be aware of. The easiest way to look for functionality is
to use the "apropos" command:
pdl> apropos 'integral'
ceil Round to integral values in floating-point format
floor Round to integral values in floating-point format
intover Project via integral to N-1 dimensions
rint Round to integral values in floating-point format
Since the apropos command is no sophisticated search engine, make sure that you
search on a couple of related topics and use short phrases.
However, there is a good chance that what you need is not part of the PDL
distribution. You are then well advised to check out <http://pdl.perl.org>,
where there is a list of packages using PDL. If that does not solve your
problem, ask on the mailing list. If nothing else, you might get assistance
which will let you interface your package with PDL yourself; see also the next
question.
Yes, you can. In fact, it is very simple for many simple applications. What you
want is the PDL pre-processor PP (PDL::PP). This will allow you to make a
simple interface to your C routine.
The two functions you need to learn (at least first) are "pp_def",
which defines the calling interface to the function, specifying input and
output parameters, and contains the code that links to the external library,
and "pp_done", which finishes the PP definitions. For
details see the PDL::PP man-page, but we also have a worked example here.
double eight_sum(int n)
{
int i;
double sum, x;
sum = 0.0; x=0.0;
for (i=1; i<=n; i++) {
x++;
sum += x/((4.0*x*x-1.0)*(4.0*x*x-1.0));
}
return 1.0/sum;
}
We will here show you an example of how you interface C code with PDL. This is
the first example and will show you how to approximate the number 8...
The C code is shown above and is a simple function returning a double, and
expecting an integer - the number of terms in the sum - as input. This
function could be defined in a library or, as we do here, as an inline
function.
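Before wrapping the C function it can be reassuring to reproduce its result in plain PDL. The following is only a sketch: it evaluates the same sum in vectorised form, the number of terms is an arbitrary choice, and the variable names are made up for the illustration.

```perl
use strict;
use warnings;
use PDL;

my $n   = 100_000;              # number of terms (arbitrary choice)
my $x   = sequence($n) + 1;     # 1, 2, ..., n as doubles
my $den = 4 * $x * $x - 1;      # 4*x*x - 1 for every term at once

# Sum the terms and take the reciprocal, as eight_sum does in its loop.
my $approx = 1 / ($x / ($den * $den))->sumover->sclr;

printf "%.6f\n", $approx;       # 8.000000
```

The infinite sum equals 1/8, because each term x/(4*x*x-1)^2 can be written as (1/8)*(1/(2x-1)^2 - 1/(2x+1)^2) and the series telescopes. This is why the reciprocal approaches 8 as n grows.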
We will postpone the writing of the Makefile till later. First we will construct
the ".pd" file. This is the file containing PDL::PP code. We call
this "eight.pd" .
#
# pp_def defines a PDL function.
#
pp_addhdr (
'
double eight_sum(int n)
{
int i;
double sum, x;
sum = 0.0; x=0.0;
for (i=1; i<=n; i++) {
x++;
sum += x/((4.0*x*x-1.0)*(4.0*x*x-1.0));
}
return 1.0/sum;
}
');
pp_def (
'eight',
Pars => 'int a(); double [o]b();',
Code => '$b()=eight_sum($a());'
);
# Always make sure that you finish your PP declarations with
# pp_done
pp_done();
A peculiarity of our example is that we have included the entire code with
"pp_addhdr" instead of linking it in. This is only for the purposes
of example; in a typical application you will use "pp_addhdr" to
include header files. Note that the argument to "pp_addhdr" is
enclosed in quotes.
What is most important in this example is however the "pp_def"
command. The first argument to this is the name of the new function,
"eight"; then comes a hash which is the real meat:
- Pars
- This gives the input parameters (here "a") and
the output parameters (here "b"). The latter are indicated by
the "[o]" specifier. Both arguments can have a type
specification as shown here.
Many variations and further flexibility in the interface can be specified.
See "perldoc PDL::PP" for details.
- Code
- This switch contains the code that should be executed. As
you can see this is a rather peculiar mix of C and Perl, but essentially
it is just as you would write it in C, but the variables that are passed
from PDL are treated differently and have to be referred to with a
preceding '$'.
There are also simple macros to pass pointers to data and to obtain the
values of other Perl quantities, see the manual page for further
details.
Finally note the call to "pp_done()" at the end of the file. This is
necessary in all PP files.
OK. So now we have a file with code that we dearly would like to use in Perl via
PDL. To do this we need to compile the function, and to do that we need a
Makefile.
use PDL::Core::Dev;
use ExtUtils::MakeMaker;
PDL::Core::Dev->import();
$package = ["eight.pd", 'Eight', 'PDL::Eight', '', 1];
%hash = pdlpp_stdargs($package);
WriteMakefile( %hash );
sub MY::postamble {pdlpp_postamble($package)};
The code above should go in a file called Makefile.PL, which should subsequently
be called in the standard Perl way: "perl Makefile.PL" . This should
give you a Makefile and running "make" should compile the module for
you and "make install" will install it for you.
The fifth element in the $package array-ref is true. This tells PDL to generate
one C file per PP function, which with the right "make" options can
be compiled in parallel, for a useful speedup of development / installation.
This question is closely related to the previous one, and as we said there, the
PDL::PP pre-processor is the standard way of interfacing external packages
with PDL. The most usual way to use PDL::PP is to write a short interface
routine, see the PDL::PP perldoc page and the answer to the previous question
for examples.
However it is also possible to interface a package to PDL by re-writing your
function in PDL::PP directly. This can be convenient in certain situations, in
particular if you have a routine that expects a function as input and you
would like to pass the function a Perl function for convenience.
The PDL::PP perldoc page is the main source of information for writing PDL::PP
extensions, but it is very useful to look for files in the distribution of PDL
as many of the core functions are written in PDL::PP. Look for files that end
in ".pd" which is the generally accepted suffix for PDL::PP files.
But we also have a simple example here.
The following example will show you how to write a simple function that
automatically allows broadcasting. To make this concise the example is of an
almost trivial function, but the intention is to show the basics of writing a
PDL::PP interface.
We will write a simple function that calculates the minimum, maximum and average
of an ndarray. On my machine the resulting function is 8 times faster than the
built-in function "stats" (of course the latter also calculates the
median).
Let's jump straight in. Here is the code (from a file called
"quickstats.pd" )
#
pp_def('quickstats',
Pars => 'a(n); [o]avg(); [o]max(); [o]min()',
Code => '$GENERIC(a) curmax, curmin;
$GENERIC(a) tmp=0;
loop(n) %{
tmp += $a();
if (!n || $a() > curmax) { curmax = $a();}
if (!n || $a() < curmin) { curmin = $a();}
%}
$avg() = tmp/$SIZE(n);
$max() = curmax;
$min() = curmin;
'
);
pp_done();
The above might look like a confusing mixture of C and Perl, but behind the
peculiar syntax lies a very powerful language. Let us take it line by line.
The first line declares that we are starting the definition of a PDL::PP function
called "quickstats" .
The second line is very important as it specifies the input and output
parameters of the function. "a(n)" tells us that there is one input parameter
that we will refer to as "a", which is expected to be a vector of length n
(likewise matrices, both square and rectangular, would be written as "a(n,n)"
and "a(n,m)" respectively). To indicate that something is an output parameter
we put "[o]" in front of its name, so referring back to the code we see that
avg, max and min are three output parameters, all of which are scalar (since
they have no dimensional size indicated).
The third line starts the code definition which is essentially pure C but with a
couple of convenient functions. $GENERIC is a function that returns the C type
of its argument - here the input parameter a. Thus the first two lines of the
code section are variable declarations.
The loop(n) construct is a convenience function that loops over the dimension
called n in the parameter section. Inside this loop we calculate the
cumulative sum of the input vector and keep track of the maximum and minimum
values. After the loop we assign the resulting values to the output parameters.
Finally we finish our function declaration with "pp_done()" .
To compile our new function we need to create a Makefile, which we will just
list since its creation is discussed in an earlier question.
use PDL::Core::Dev;
use ExtUtils::MakeMaker;
PDL::Core::Dev->import();
$package = ["quickstats.pd", 'Quickstats', 'PDL::Quickstats', '', 1];
%hash = pdlpp_stdargs($package);
WriteMakefile( %hash );
sub MY::postamble {pdlpp_postamble($package)};
An example Makefile.PL
Our new statistic function should now compile using the tried and tested Perl
way: "perl Makefile.PL; make" .
You should experiment with this function, changing the calculations and input
and output parameters. In conjunction with the PDL::PP perldoc page this
should allow you to quickly write more advanced routines directly in PDL::PP.
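When experimenting, it helps to have reference values to compare against. The sketch below does not call "quickstats" itself (that requires the compiled module); it uses the built-in reductions "average", "maximum" and "minimum" on a small 2D ndarray. It is an assumption, not something tested here, that a compiled "quickstats" returns the same three ndarrays for floating-point input like this.

```perl
use strict;
use warnings;
use PDL;

# Three rows of four values: a signature of a(n) is applied once per row.
my $data = sequence(4, 3);

my $avg = $data->average;   # mean over the first dimension
my $max = $data->maximum;   # maximum over the first dimension
my $min = $data->minimum;   # minimum over the first dimension

print "avg = $avg\n";       # avg = [1.5 5.5 9.5]
print "max = $max\n";       # max = [3 7 11]
print "min = $min\n";       # min = [0 4 8]
```

Each output has one element per row of the input, which is what broadcasting over the extra dimension looks like for a function whose outputs are scalars.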
If you find any inaccuracies in this document (or dysfunctional URLs) please
report them to the perldl mailing list
[email protected].
Achim Bohnet ([email protected]) for suggesting CoolHTML as a prettypodder
(although we have switched to XML now) and various other improvements.
Suggestions for some questions were taken from Perl FAQ and adapted for PDL.
Many people have contributed or given feedback on the current version of the
FAQ, here is an incomplete list of individuals whose contributions or posts to
the mailing-list have improved this FAQ at some point in time alphabetically
listed by first name: Christian Soeller, Chris Marshall, Doug Burke, Doug
Hunt, Frank Schmauder, Jarle Brinchmann, John Cerney, Karl Glazebrook, Kurt
Starsinic, Thomas Yengst, Tuomas J. Lukka.
This document emerged from a joint effort of several PDL developers (Karl
Glazebrook, Tuomas J. Lukka, Christian Soeller) to compile a list of the most
frequently asked questions about PDL with answers. Permission is granted for
verbatim copying (and formatting) of this material as part of PDL.
Permission is explicitly not granted for distribution in book or any
corresponding form. Ask on the PDL mailing list
[email protected] if some of the issues covered in here are
unclear.