The module Scientific.Statistics contains several elementary statistical analysis functions.
The functions average(data), variance(data), and standardDeviation(data) all take a sequence of data values and return the statistical quantity indicated by their name.
Examples:
In [4]: from numpy import *
In [5]: from Scientific.Statistics import average, variance, standardDeviation
In [6]: sqrt(arange(100.))
Out[6]:
array([ 0. , 1. , 1.41421356, 1.73205081, 2. , 2.23606798,
2.44948974, 2.64575131, 2.82842712, 3. , 3.16227766,
3.31662479, 3.46410162, 3.60555128, 3.74165739, 3.87298335,
4. , 4.12310563, 4.24264069, 4.35889894, 4.47213595,
4.58257569, 4.69041576, 4.79583152, 4.89897949, 5. ,
5.09901951, 5.19615242, 5.29150262, 5.38516481, 5.47722558,
5.56776436, 5.65685425, 5.74456265, 5.83095189, 5.91607978,
6. , 6.08276253, 6.164414 , 6.244998 , 6.32455532,
6.40312424, 6.4807407 , 6.55743852, 6.63324958, 6.70820393,
6.78232998, 6.8556546 , 6.92820323, 7. , 7.07106781,
7.14142843, 7.21110255, 7.28010989, 7.34846923, 7.41619849,
7.48331477, 7.54983444, 7.61577311, 7.68114575, 7.74596669,
7.81024968, 7.87400787, 7.93725393, 8. , 8.06225775,
8.1240384 , 8.18535277, 8.24621125, 8.30662386, 8.36660027,
8.42614977, 8.48528137, 8.54400375, 8.60232527, 8.66025404,
8.71779789, 8.77496439, 8.83176087, 8.88819442, 8.94427191,
9. , 9.05538514, 9.11043358, 9.16515139, 9.21954446,
9.2736185 , 9.32737905, 9.38083152, 9.43398113, 9.48683298,
9.53939201, 9.59166305, 9.64365076, 9.69535971, 9.74679434,
9.79795897, 9.8488578 , 9.89949494, 9.94987437])
In [7]: print average(sqrt(arange(100.)))
6.61462947103
In [8]: print standardDeviation(sqrt(arange(100.)))
2.40929952538
|
The submodule Histogram defines a histogram object.
Histogram(data, number_of_bins,(data_low,data_high)) takes a data sequence and calculates a histogram by dividing the data's range into the specified number of bins, counting how many data values fall within each bin interval. An optional third argument is a tuple specifying the range to be analyzed (lower bound data_low and upper bound data_high).
The resulting histogram is an array of rank 2 representing a sequence of bins, with each bin described by its center and its count.
Example:
In [21]: from Scientific.Statistics.Histogram import Histogram In [22]: hg = Histogram(sqrt(arange(100.)), 10) In [23]: for i in hg: ....: print i ....: ....: [ 0.49749372 1. ] [ 1.49248116 3. ] [ 2.48746859 5. ] [ 3.48245603 7. ] [ 4.47744347 9. ] [ 5.4724309 11. ] [ 6.46741834 13. ] [ 7.46240578 15. ] [ 8.45739322 17. ] [ 9.45238065 19. ] |
There are 10 histogram bins.
The first element in each row is the bin value, midway between the lower and upper bin boundaries, and the second is the count of the number of data items in that bin. That is, in the above example, there is a bin with data range (4.0,5.0) and count 9.
A common data analysis technique for times series is to estimate the frequency components. This is done with the Fourier transform, mathematically, or numerically with the Fast Fourier transform (FFT). You set this up with
In [1]: from numpy import * In [2]: from numpy.fft import * |
The FFT is estimated using the function fft(data,n=None,axis=-1). Here's an example where we generate a waveform with two sine waves.
In [3]: x = arange(64.0)
In [4]: y = sin(x/3) + cos(x/5)
In [5]: print y
[ 1. 1.30726127 1.5394308 1.6668066 1.66864461 1.53571026
1.27165518 0.89305302 0.4280731 -0.08608209 -0.6067148 -1.08977817
-1.49419621 -1.78590325 -1.94117726 -1.94891677 -1.81162417 -1.54499643
-1.17617391 -0.7408407 -0.27949239 0.16672578 0.560164 0.87035526
1.07685723 1.17095629 1.15606789 1.04681136 0.86688311 0.64598274
0.41614918 0.20792581 0.04678916 -0.04975761 -0.07410214 -0.02924059
0.0717784 0.20761356 0.35138783 0.47412246 0.5484515 0.55218675
0.4713187 0.30210095 0.05196685 -0.26084242 -0.60891547 -0.95840818
-1.27259117 -1.51582361 -1.65751878 -1.67566314 -1.55949548 -1.31104093
-0.94531715 -0.48917279 0.02113347 0.54336808 1.03341605 1.44960415
1.75679921 1.92984989 1.95600997 1.83609022]
In [6]: abs(fft(y)[0:32])
Out[6]:
array([ 6.18571506, 7.08187202, 39.9364864 , 24.3758143 ,
15.59164636, 5.63184873, 3.35647352, 2.37098615,
1.83040175, 1.49316891, 1.26476944, 1.10094631,
0.97837048, 0.88365397, 0.80860264, 0.74794032,
0.69812731, 0.65670695, 0.6219252 , 0.59249929,
0.56747197, 0.54611656, 0.52787358, 0.51230736,
0.49907573, 0.48790847, 0.47859185, 0.47095727,
0.46487304, 0.46023825, 0.45697846, 0.4550425 ])
|
Slicing was used to extract only the first 32 data samples. Since the FFT is symmetric for data represented by real (rather than complex) values, the last half of the FFT is a mirror to the first. And so, we only needed the first half.
Also, we took the absolute value abs() of the FFT to look at only the magnitude of the frequency components. In other words, we calculated the power spectral density—the power in each frequency bin.
It's a bit hard to see what's going on in the output FFT, such as locating the original two sine waves, which should appear as two, separate peaks. This is partly due to our using a too-short time series. It's also partly due to having to look at the raw numbers. We'll come back to this shortly, when we learn how to plot data.
To convert from frequencies back to time values use the inverse FFT ifft(data).
Both functions take an optional second argument specifying the length of the data sequence (by default the length of the array). If this is larger than the array, an appropriate number of zeros are added to make the effective length a power of 2. For improved efficiency, therefore, the data array length should be a power of 2. An optional third argument specifies the dimension along which the transform is performed.
Finally, the function fft2(a) performs an FFT of two-dimensional data. An optional second argument specifies the shape (a tuple of length two) of the subarray to be transformed; by default the whole array is transformed. An optional third argument (also a tuple of length two) specifies the two dimensions for the transformation.
The Scientific package's module LeastSquares provides leastSquaresFit(model, parameter, data) to perform a general nonlinear least-squares fit (using the Levenberg-Marquardt algorithm).
The argument model specifies the function that you want to fit. It is called with two arguments: The first is a tuple containing the model parameters and the second is the first element of a data point (see below). The return value must be a number.
The argument parameter is a tuple of initial values for the fit parameters. The speed of convergence of the method depends on the quality of the initial values, so they should be chosen with care.
The argument data is a sequence of data points to which the model is fit. Each data point is a sequence of length two or three. The first element of each data point specifies the independent variables of the model and is passed to the model function as its first parameter, but not used in any other way. The second element of each data point tuple is the value that the model function is supposed to approximate. The third element (which defaults to 1.) is the variance of the data point—that is, the inverse of its statistical weight in the fitting procedure.
Example:
First we define a model—an exponential function—and then create some data. These get passed to the least squares fitting function.
In [1]: from Scientific.Functions.LeastSquares import leastSquaresFit In [2]: import numpy In [3]: def exponential(parameters, x): ...: a = parameters[0] ...: b = parameters[1] ...: return a*numpy.exp(-b/x) ...: In [4]: data = [(100, 4.999e-8), ...: (200, 5.307e+2), ...: (300, 1.289e+6), ...: (400, 6.559e+7)] In [5]: fit = leastSquaresFit(exponential, (1e13, 4700), data) In [6]: print "Fitted parameters:", fit[0] Fitted parameters: [8641551709749.8301, 4715.4677901570494] In [7]: print "Fit error:", fit[1] Fit error: 1080.2526438 |
In numerical calculations many functions are defined by values on a grid of points. Evaluating these functions for off-grid points requires interpolation. The module Scientific.Functions.Interpolation defines such functions and provides some common operations on them. The grids must be orthogonal, but do not have to be equidistant. A grid is defined by a set of axes, one per dimension, specifying the coordinates of the grid points.
A function is defined by InterpolatingFunction(grid, values), where grid is a sequence of axes whose length determines the grid's dimensionality. The argument values must be an array with a rank equal to or higher than the dimensionality of the grid. If it is higher, then the function has multiple values (e.g., a vector field). An optional third argument specifies a value for the function beyond the grid. If none is given, evaluation of the function outside the grid results in an error.
An interpolating function is called like any other function. The number of arguments must be equal to the dimensionality of the grid, and the arguments must be integers or real numbers.
Interpolating functions offer some useful operations in addition to simple evaluation:
Example:
First, we define an axis with 10 points, make some "data" (via square roots), create the interpolating function, and then evaluate it.
In [1]: from Scientific.Functions.Interpolation import InterpolatingFunction In [2]: import numpy; N = numpy In [3]: axis = N.arange(0, 1.1, 0.1) In [4]: values = N.sqrt(axis) In [5]: f = InterpolatingFunction((axis,), values) In [6]: for x in N.arange(0, 1., 0.13): ...: print "%5.2f: %10.5f %10.5f" % (x, N.sqrt(x), f(x)) ...: ...: 0.00: 0.00000 0.00000 0.13: 0.36056 0.35552 0.26: 0.50990 0.50752 0.39: 0.62450 0.62398 0.52: 0.72111 0.72060 0.65: 0.80623 0.80563 0.78: 0.88318 0.88287 0.91: 0.95394 0.95381 |
Note that to make the output readable I've used a formatting string in the print function. See String Formatting in the Strings chapter of Learning Python.
Note also here how the abbreviations for the module names are created. Modules are data objects, so they can be assigned to variables.
NumPy's random module provides support for generating random numbers, including multi-dimensional arrays of them.
Random numbers find their way into many simulation algorithms representing the element of chance or noisy measurement data. Since an algorithm is used to generate the numbers, they are not truly “random” but rather have statistical properties that emulate what we think “truly random” is.
In fact, it's very hard to generate good approximates of random numbers. Many of the random number generators are equivalent to chaotic dynamical systems and these typically have some structure somewhere. But I digress ... only temporarily.
Like most random number packages, the random module allows the user to pick a seed value that initializes the random number generator to a known state. Picking a different seed each time a program starts, for example, varies the randomness across the runs. If you need an identical series of “random” numbers from run to run, however, simply fix the seed value to be the same each run.
To use the random, import it as follows:
In [1]: from numpy import * In [2]: from numpy.random import * |
Then set the seed value like this:
In [3]: seed(2345) |
If no arguments are specified, values will be substituted based on the system clock— a common and useful default. Then each time you start the program you'll get a different realization from the random number generator. Typically, seed() is only called at the start of a session or when you want to restart the generated sequence.
Random numbers are specified by their distribution. Different distributions are used to simulate different natural processes. One common distribution is the uniform distribution. This is used to model equally random choices between an upper and lower values. Another common choice is the normal or Gaussian distribution.
The random module computes sequences with either of these distributions as well as others, including binomial, Poisson, chi-square, F, gamma, beta, and exponential.
All of the distribution functions accept a shape tuple specifying the desired output matrix size. In addition, some functions also require information about the desired distribution. The following code computes a 3-by-3 matrix of normally distributed random numbers with a mean of zero and a variance of 1.0.
In [4]: a = normal(0.0,1.0,(3,3))
In [5]: a
Out[5]:
array([[-0.95129875, 1.7687718 , -1.14182693],
[ 0.71075487, 0.51095064, 1.14902943],
[-0.53846023, -0.73664509, -0.07599648]])
|
See the NumPy manual for further information.
The numpy package also contains operations that are particularly useful for dealing with matrices (arrays of rank 2) and vectors (arrays of rank 1).
In [17]: a = array([1., 2., 3.]) In [18]: b = array([0., 1., 0.]) In [19]: dot(a,b) Out[19]: 2.0 |
It can also be used for a vector and a matrix (vector-matrix product):
In [22]: m = array([[1.,0.,0.],[0.,1.,0.],[0.,0.,1.]]) In [23]: print m [[ 1. 0. 0.] [ 0. 1. 0.] [ 0. 0. 1.]] In [24]: dot(a,m) Out[24]: array([ 1., 2., 3.]) In [25]: dot(m,a) Out[25]: array([ 1., 2., 3.]) |
Finally, it also applies to two matrices giving the matrix product.
In [28]: mm = array([[1.,2.,3.],[4.,5.,6.],[7.,8.,9.]])
In [29]: dot(m,mm)
Out[29]:
array([[ 1., 2., 3.],
[ 4., 5., 6.],
[ 7., 8., 9.]])
|
Note that the last dimension of m must have the same length as the first dimension of mm.
In [30]: transpose(mm)
Out[30]:
array([[ 1., 4., 7.],
[ 2., 5., 8.],
[ 3., 6., 9.]])
|
This function can actually be applied to arrays of any rank and in general reverses the order of the dimensions.
In [31]: mmm = array([[1.,2.,3.],[4.,5.,6.]])
In [32]: transpose(mmm)
Out[32]:
array([[ 1., 4.],
[ 2., 5.],
[ 3., 6.]])
|
In [35]: identity(5)
Out[35]:
array([[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1]])
|
In [36]: ravel(mmm) Out[36]: array([ 1., 2., 3., 4., 5., 6.]) |
Note that the order of elements is defined by the first index varying most slowly. So, for example, the elements of a 2 x 2 array are returned in the order (0,0), (0,1), (1,0), (1,1).
In [37]: reshape(mmm,(3,2))
Out[37]:
array([[ 1., 2.],
[ 3., 4.],
[ 5., 6.]])
|
This is useful when constructing special matrices. The shape must lead to the same total number of elements.
In [39]: resize(mmm,(3,3))
Out[39]:
array([[ 1., 2., 3.],
[ 4., 5., 6.],
[ 1., 2., 3.]])
|
The NumPy submodule numpy.linalg contains common matrix operations, where the matrices are represented by two-dimensional arrays. Some operations require square matrices and so check their input accordingly. All operations are available for real and for complex matrices. The linear algebra functions are based on time-tested and efficient LAPACK routines.
Let's set up the environment and create a matrix so that we can try the routines:
In [1]: import numpy as N In [2]: import numpy.linalg as LA In [3]: a = N.reshape(N.arange(25.), (5, 5)) + N.identity(5) In [4]: print a [[ 1. 1. 2. 3. 4.] [ 5. 7. 7. 8. 9.] [ 10. 11. 13. 13. 14.] [ 15. 16. 17. 19. 19.] [ 20. 21. 22. 23. 25.]] |
The function solve(A, B) solves a system of linear equations with a square nonsingular matrix A and a righthand-side vector B. That is, it finds x such that A x = B. Several righthand-side vectors can be treated simultaneously by making B a two-dimensional array, that is, a sequence of vectors. The resulting x is then a matrix.
In particular, the function inv(A) calculates the inverse of the square nonsingular matrix A by calling solve(A, B) with a suitable B—the identity matrix. The resulting matrix x is the inverse of A.
In [5]: inv_a = LA.inv(a) In [6]: print inv_a [[ 0.20634921 -0.52380952 -0.25396825 0.01587302 0.28571429] [-0.5026455 0.63492063 -0.22751323 -0.08994709 0.04761905] [-0.21164021 -0.20634921 0.7989418 -0.1957672 -0.19047619] [ 0.07936508 -0.04761905 -0.17460317 0.6984127 -0.42857143] [ 0.37037037 0.11111111 -0.14814815 -0.40740741 0.33333333]] |
We can test that this is, in fact, the inverse by printing the largest magnitude element of a * a^{-1} - identity(5):
In [7]: print "Inversion error:", \ ...: N.maximum.reduce(N.fabs(N.ravel(N.dot(a, inv_a)-N.identity(5)))) Inversion error: 1.7763568394e-15 |
Note how the long command line was continued on to the next by using a backslash.
The function eigvals(A) returns the eigenvalues of the square matrix A.
In [8]: print LA.eigvals(a) [ 64.91164992 -2.91164992 1. 1. 1. ] |
The function eig(A) returns both eigenvalues and eigenvectors—the latter as a two-dimensional array.
In [9]: print LA.eig(a)[0] [ 64.91164992 -2.91164992 1. 1. 1. ] In [10]: print LA.eig(a)[1] [[-0.0851802 0.67779864 -0.49548253 -0.19051114 -0.22160346] [-0.23825372 0.36348873 0.31510631 0.07519503 -0.14494105] [-0.39132723 0.04917881 0.27307212 -0.13066882 0.33201444] [-0.54440074 -0.2651311 0.49046692 0.79779709 0.6572081 ] [-0.69747425 -0.57944101 -0.58316283 -0.55181217 -0.62267803]] |
The function det(A) returns the determinant of the square matrix A.
In [11]: print LA.det(a) -189.0 |
The function svd(A) returns the singular value decomposition of matrix A: three arrays V, S, and WT whose matrix product is the original matrix A = V S WT. V and WT are unitary matrices. The returned S is a vector (rank-1 array) of singular values, the diagonal elements of the singular-value matrix S. This function is mainly used to check whether (and in what way) a matrix is ill-conditioned.
In [12]: V, S, W = LA.svd(a)
In [13]: print V
[[ -7.25057693e-02 -7.71195769e-01 6.32455532e-01 4.31525864e-16
-1.15073564e-16]
[ -2.30109765e-01 -4.97040739e-01 -6.32455532e-01 5.47172583e-01
2.45390445e-02]
[ -3.87713761e-01 -2.22885709e-01 -3.16227766e-01 -7.47853768e-01
3.75119637e-01]
[ -5.45317758e-01 5.12693211e-02 -2.22044605e-16 -1.45810213e-01
-8.23856408e-01]
[ -7.02921754e-01 3.25424351e-01 3.16227766e-01 3.46491398e-01
4.24197727e-01]]
In [14]: print S
[ 70.81579622 2.66889607 1. 1. 1. ]
In [15]: print W
[[ -3.86049372e-01 -4.15649728e-01 -4.45250085e-01 -4.74850441e-01
-5.04450797e-01]
[ 6.71539934e-01 3.56700580e-01 4.18612254e-02 -2.72978129e-01
-5.87817483e-01]
[ 6.32455532e-01 -6.32455532e-01 -3.16227766e-01 1.66533454e-16
3.16227766e-01]
[ -0.00000000e+00 5.47172583e-01 -7.47853768e-01 -1.45810213e-01
3.46491398e-01]
[ 0.00000000e+00 2.45390445e-02 3.75119637e-01 -8.23856408e-01
4.24197727e-01]]
|
The function lstsq(A, b, cutoff) returns the least-squares solution x to an overdetermined system of linear equations that minimizes the residual of || A x - b ||. The (optional) third argument cutoff indicates the threshold for the range of singular values; defaults to cutoff = 10-10. There are four return values: (i) the least-squares solution itself x, (ii) the sum of the squared residuals (i.e., the quantity minimized by the solution), (iii) the rank of matrix A, and (iv) the singular values of A in descending order.
In [16]: b = [1.,1.,1.,1.,1.] In [17]: x, resid, rank, s = LA.lstsq(a,b) In [18]: print x [-0.26984127 -0.13756614 -0.00529101 0.12698413 0.25925926] In [19]: print resid [] In [20]: print rank 5 In [21]: print s [ 70.81579622 2.66889607 1. 1. 1. ] |
In general, resid is returned as an empty array when the rank of a is less than or equal to the number of columns of a or greater than or equal to the number of columns.
In [21]: from numpy import * In [22]: from Scientific.Functions.FindRoot import newtonRaphson In [23]: from math import pi In [24]: def func(x): ....: return (2*x*cos(x) - sin(x))*cos(x) - x + pi/4.0 ....: In [25]: newtonRaphson(func, 0.0, 1.0, 1.0e-12) Out[25]: 0.95284786465494198 |
Here's how we can find the two fixed point of the logistic map at r = 4.0: We find the root x - f(x).
In [13]: def LogisticMap(x): ....: return x - 4.0 * x * (1.0 - x) ....: In [14]: newtonRaphson(LogisticMap,0.,1.,1.0e-6) Out[14]: 0.0 In [15]: newtonRaphson(LogisticMap,.1,1.,1.0e-6) Out[15]: 0.75000000000000655 |
Note how I changed the range in which to look for the root in order for the routine to find the fixed point at x = 3/4.
There are additional modules for scientific applications that not covered here which provide, for example, statistical parameter estimation and function interpolation. See the Scientific package manual for documentation and the NumPy manual and SciPy documentation.
There is no single, standard full-featured plotting module for Python. In fact, there are multiple choices—including GnuPlot, matplotlib, VRML, and visual Python—each with its own features. We'll work with several, choosing whichever meets on our needs.
[Unix/Mac] If not already running, start up the X11 Windows server to see the plots generated.
[Windows] Any display set up here?
I started with ipython -p pylab, this reads the ipythonrc-pylab start-up script in the .ipythonrc subdirectory. With this, I don't have to import the various SciPy and matplotlib functions one by one.
If you didn't start up that way, use
In [1]: from pylab import * |
Here's the plot() function. Let's find out what it does by typing plot? at the iPython prompt:
In [2]: plot? Base Class: |
And lots more documentation. Well developed objects in Python typically have lots of information about themselves, which is accessed via the `?' query in iPython. Page through the documentation using the space-bar and exit by typing `q'.
plot() has many options, as you can see. Read over the documentation to check these out.
Let's try an example:
In [3]: plot([1,2,3], [1,2,3], 'go-', label='line 1',linewidth=2) Out[3]: [<:matplotlib.lines.Line2D instance at 0x3415058>] In [4]: plot([1,2,3], [1,4,9], 'rs', label='line 2') Out[4]: [<matplotlib.lines.Line2D instance at 0x3415cd8>] In [5]: axis([0, 4, 0, 10]) Out[5]: [0, 4, 0, 10] In [6]: legend() Out[6]: <matplotlib.legend.Legend instance at 0x3415e40> In [7]: show() |
On most platforms, you have to show() the plot to get a window on the screen that displays what you've built up with the preceding commands.
show() waits for you to exit, which under Unix (at least) happens when you close the graphics window.
Curious about what axis and legend do? Then type axis? and legend?. Again, lots of documentation.
Let's combine the Histogram function from the Statistics module with plot() to take a look at where the iterates of the logistic map visit the interval.
In [1]: from numpy import * In [2]: from pylab import * In [3]: from Scientific.Statistics.Histogram import Histogram In [4]: xlist = zeros((20),float) In [5]: ylist = zeros((20),float) In [6]: its = zeros((1000),float) In [7]: x = 0.3 In [8]: for i in xrange(1000): ...: x = 4.0 * x * (1.0 - x) ...: its[i] = x ...: ...: In [9]: hg = Histogram(its,20) In [10]: for i in xrange(20): ....: xlist[i] = hg[i][0] ....: ylist[i] = hg[i][1] ....: ....: In [11]: plot(xlist,ylist) Out[11]: [<matplotlib.lines.Line2D instance at 0x3182378 |
Looks like the iterates of the logistic map spend most of their time near 0.0 and near 1.0.
Similarly, we can develop a a better view of the FFT example above. Rather than a list of numbers, we look at its plot:
In [1]: from numpy import * In [2]: from numpy.fft import * In [3]: from pylab import * In [4]: x = arange(128.0) In [5]: y = sin(x/3)+cos(x/5) In [6]: plot(x,y) Out[6]: [<matplotlib.lines.Line2D instance at 0x35988f0>] In [7]: show() |
Quit the plotting window.
Now let's take the FFT:
In [9]: plot(x[0:64],abs(fft(y)[0:64])) Out[9]: [<matplotlib.lines.Line2D instance at 0x3739c60^gt;] In [10]: show() |
The two frequencies are much easier to see with a longer time series and using a graphics display compared to the list of numbers we had before.
We can also work with two-dimensional (matrix) data.
In [1]: from numpy import *
In [2]: from numpy.fft import *
In [3]: from pylab import *
In [4]: x = arange(64.0)
In [5]: y = sin(x/3) + cos(x/5)
In [6]: z = cos(x/2)
In [7]: a = add.outer(y,z)
In [8]: print a
Out[8]:
array([[ 2. , 1.87758256, 1.54030231, ..., 1.60905598,
1.91474236, 1.99646791],
[ 2.30726127, 2.18484384, 1.84756358, ..., 1.91631725,
2.22200363, 2.30372918],
[ 2.5394308 , 2.41701336, 2.0797331 , ..., 2.14848677,
2.45417315, 2.5358987 ],
...,
[ 2.92984989, 2.80743245, 2.47015219, ..., 2.53890586,
2.84459224, 2.92631779],
[ 2.95600997, 2.83359253, 2.49631227, ..., 2.56506594,
2.87075233, 2.95247788],
[ 2.83609022, 2.71367279, 2.37639253, ..., 2.4451462 ,
2.75083258, 2.83255813]])
In [9]: a.shape
Out[9]: (64, 64)
In [10]: imshow(a)
Out[10]: <matplotlib.image.AxesImage instance at 0x2d7dda0>
In [11]: show()
|
We can then take the 2D FFT:
In [12]: a_fft = abs(fft2(a)) In [13]: imshow(a_fft) Out[13]: <matplotlib.image.AxesImage instance at 0x2d96c88> In [14]: show() |
You can download a truckload of matplotlib examples.
[Mac OS X] These examples are already provided in the fink installation directory:
/sw/share/doc/matplotlib-py25/examples/ |
They will run as-is under Unix.
[Windows] Where are they?
[Linux] Where are they?
Gnuplot:
Plotting can also be delegated to an external program. Gnuplot is one such example. It contains a function for simple line plots: plot. This is the same name as above, but with a different range of abilities. See plot? for its features.
Gnuplot is useful for generating not only on-screen plots, but also PostScript output, which is helpful in documenting your work and producing high-resolution images and hardcopy.
Gnuplot's function plot takes an arbitrary number of arguments, each specifying a data set. All data sets will be plotted in the same graph. A data set is a sequence of points. A point is specified by a two-element sequence (x and y coordinates) or by a simple number, which is interpreted as the y coordinate, the x coordinate being the index of the point in the sequence.
Example:
In [1]: from numpy import * In [2]: from IPython.numutils import * In [3]: from IPython.GnuplotInteractive import * *** Type `gphelp` for help on the Gnuplot integration features. In [4]: x=frange(0,2*pi,npts=100) In [5]: plot(x,sin(x)) |
Note that I used the IPython.numutils floating point frange() function frange, rather than NumPy's arange(). It's slightly more compact.
Adding the filename argument allows one to produce, for example, an Encapsulated PostScript file with the same graph:
In [6]: plot(x,sin(x),filename="t.eps") In [7]: !more t.eps %!PS-Adobe-2.0 EPSF-2.0 %%Title: t.eps %%Creator: gnuplot 4.0 patchlevel 0 %%CreationDate: Mon Apr 24 11:42:59 2006 %%DocumentFonts: (atend) ... |
The output file type is determined by the filename extension, which in this case is eps.
Try viewing the file t.eps in a PostScript viewer.
Gnuplot's website has extensive documentation and examples. The Python interface appears to be largely compatible with the standalone version the website describes.