StochasticProgramming
From OpenOpt
| Made by Dmitrey |
Contents |
Preface
Stochastic programming and optimization is much more difficult than ordinary nonlinear optimization. For example, having 100 stochastic binary variables, calculating exact CDF of their sum consumes 2^100 operations - you are not capable of neither calculating it even one time nor storing it in memory of ordinary computer. Thus usually approximate algorithms are used with different performance and accuracy.
Currently the FuncDesigner functionality for stochastic programming has only one approach to solve optimization problems and only limited set of parameters to improve performance via lowering accuracy and vise versa. You can find in internet some other SP solvers that may be superior to our soft for some types of problems, but using our module you will involve modern computer language Python with great flexibility and RAD features, probably along with NumPy / SciPy, especially with scipy.stats module, and save your time due to easy and efficient programming constructs that don't require much time and efforts to learn. Sometimes it's better to spend 1 week for coding and 1 hour to run program than 1 min to run program and 2 weeks for coding it in an obsolete programming language with obsolete programming constructs.
Also, we have lots of future plans to improve and enhance our stochastic programming functionality and computations speed, but it first of all depends on financial resources we will be capable to gain for it. Some other developers of optimization software have finance support either from universities, or their government foundations, in Ukraine getting something like that is almost impossible and salaries in Ukrainian Science Academy are ~ 300-400 €, that is 20-30 times lower than in USA or West Europe. Thus developing all our software for free is too hard to keep it competitive vs our competitors. Half of obtained finances will go to the stochastic module development and other half - for general OpenOpt Suite development.
Due to this and some other reasons we had to write some proprietary software in addition to our free OpenOpt/FuncDesigner, and we decided to choose stochastic programming for this. Thus, at least for several years, our Stochastic Programming addon for free OpenOpt Suite is free only for academic/research purposes and only for small-scaled problems (less than 4 free variables, less than 4 stochastic variables, less than 4 fixed variables, including stochastic ones). Also, currently our soft has another one important limitation: dependency of any function (objective or constraint) on some stochastic variables it is built on must to be tree, e.g. for stochastic variables a,b,c you can handle funcs like a + b + a*any_func(nonstochastic_variables), but not a*b*a or a*b+a (because it will involve computation on stochastic variables / functions that are not independent; as for sum, it is a little bit optimized to rearrange summation sequence to try summation in correct order, thus a+b+sin(a) + cos(b) is ok). If you form a stochastic function that FuncDesigner cannot handle yet, you will be automatically informed (FuncDesignerException will be raised). Future plans include handling bigger variety of cases like mentioned above.
For additional information about prices contact us. Currently USA prints new dollars very intensively and thus Ukraine and many other countries (who directly or approximately adjust their money units with dollars) has essential rate of inflation and hence many products has no choice than rapid increasing their prices. Yet to make our software available to as wide auditory as possible we will try to keep our prices several times less than our competitors have, also there are some possible discounts, including region-based ones.
If you would like to stick FuncDesigner SP with your software that has a free OSI-approved license, we hope we will find a solution with essentially reduced cost for you.
You could try the stochastic programming addon even without installation, via our sage server (example), pay attention to switching from Sage to Python language.
Documentation
Prior to learning FuncDesigner Stochastic Programming documentation, it is recommended to read documentation of FuncDesigner (see our Doc page), NumPy and SciPy (especially module scipy.stats, see its list of functions and basic tutorial).
Creating distributions
To create discrete distribution you can use function distribution.discrete(values, probabilities), e.g.:
from FuncDesigner import * values = [1, -1, 2, 3] probabilities = [0.1, 0.3, 0.2, 0.4] A = distribution.discrete(values, probabilities) print(A.values, A.probabilities) # [ 1. -1. 2. 3.] [ 0.1 0.3 0.2 0.4]
Using A.cdf.plot() (requires matplotlib installed) you can get the picture:
For continuous and superposition of continuous and discrete distributions you will get cdf pictures like this:
For continuous distributions you can also use A.pdf.plot() and get picture like this:
For creating discrete and continuous distributions via known ppf function (Percent point function, inverse of cdf — percentiles) you can use distribution.discrete() and distribution.continuous() with parameter ppf, e.g.
from FuncDesigner import * from scipy import stats A = distribution.continuous(ppf = stats.norm(4, 5).ppf) # gauss distribution with mean = 4, std = 5
For the sake of simplicity we have provided simplified interface for some most common distributions (for now it's norm=normal=gauss, expon=exponential and uniform), so for example you can use
A = distribution.gauss(4, 0.5) # gauss distribution with mean = 4, std = 0.5 B = distribution.exponential(3, 0.7) # location = 3, scale = 0.7 C = distribution.uniform(-1.5, 1.5) # uniform distribution from -1.5 to 1.5
For computations on distributions and ordinary scalar variables former are used in quantified form (resolved into distributions with known fixed set of values and related probabilities). In the cases above those distributions are not resolved yet; you could define them resolved from the very beginning by using N parameter, e.g.
A = distribution.continuous(ppf = stats.norm(4, 0.5).ppf, N = 300) # gauss distribution with mean = 4, std = 0.5, resolved into 300 points with quantiles ~= 1/300, 2/300, ... C = distribution.uniform(-1.5, 1.5, N = 200) # uniform distribution from -1.5 to 1.5 approximated by 200 points
Alternatively, you can quantify distributions automatically inside computations. Suppose you have function f = 2*a + 3*b*x + y*sin(c), where a, b, c are stochastic variables with distributions A, B, C, and x,y are scalar variables. To get distribution of f you can use
from FuncDesigner import * A = distribution.normal(4, 0.5) B = distribution.exponential(3, 0.7) C = distribution.uniform(-1.5, 1.5) x, y, a, b, c = oovars('x y a b c') f = 2*a + 3*b*x + y*sin(c) point = oopoint({a:A, b:B, c:C, x:4, y:0.3}, maxDistributionSize = 100) F = f(point) # now you can operate with F as well as with A, B, C, e.g. print(F.mean(), F.std(), F.var()) # (51.908303827572794, 7.4675507109884496, 55.764313621184101) # or mere w/o brackets via capitalized attribute names: print(F.Mean, F.Std, F.Var) F.cdf.plot()
Parameter maxDistributionSize is used to limit size (number of values and related probabilities) of temporary / auxiliary (and final result) distributions obtained during computations. The bigger the value is, the more exact results you will obtain, but it will required time and RAM grows as O(n^2). There are different kinds of convolutions, that are capable of reducing it in cost of some additional rounding errors, but currently it's unimplemented yet and remains in our future plans.
Unfortunately, currently FuncDesigner in some cases can fail on constructing new distribution as superposition of some others if it appears that dependencies of some inputs intersects (you will be informed via FuncDesignerException). For example, F can be 2*a + sin(b) + cos(a) (implicit or explicit FuncDesigner sum is optimized for this situation), can be a*a*b, but cannot be a*b*a, because (a*b) and a are not independent. We have some ideas to improve the situation, but it is not implemented yet, we also keep this one and some other ideas in our future plans.
For numerical optimization, as you can see in the provided examples, parameter maxDistributionSize must be supplied to either prob constructor, e.g. p = NLP(..., maxDistributionSize = 100,...), or assigned afterwards (p.maxDistributionSize = 100), or passed to minimize/maximize/solve method, e.g. r = p.minimize(..., maxDistributionSize = 100,...).
Local nonlinear optimization example
from FuncDesigner import * from openopt import NLP A = distribution.gauss(4, 0.5) # gauss distribution with mean = 4, std = 0.5 # this is same to #from scipy import stats #_a = distribution.continuous(ppf=stats.norm(4, 0.5).ppf) # along with "gauss" you can use "norm" (for scipy.stats compatibility, yet I dislike it due to ambiguity with linalg.norm) # or "normal" B = distribution.exponential(3, 0.7) # for compatibility with scipy.stats you can use "expon" C = distribution.uniform(-1.5, 1.5) # uniform distribution from -1.5 to 1.5 a, b, c = oovars('a b c') x, y, z = oovars('x y z', lb=-1, ub=1) f = x*a**2 + y*b**2 + z*c**2 + (x-1)**2 + y**2 + (z-5)**2 objective = 0.15 * mean(f+2*x) - 20 * z * var(b*exp(x)) + 15*y**2 * var(c*exp(z)) constraints = [ mean(b+y) <= 3.7, std(x*a+z) < 0.4 ] startPoint = {x: 0, y: 0, z: 0, a: A, b: B, c: C} p = NLP(objective, startPoint, constraints = constraints) # select gradient-free solver solver = 'scipy_cobyla' # for unconstrained and box-bounded problems you'd better use bobyqa # see http://openopt.org/NLP for more available solvers r = p.maximize(solver, iprint = 1, maxDistributionSize=700, plot = False) #"plot = True" means real-time graphics output of convergence, requires matplotlib installed ''' Results for v. 0.393, Intel Atom 1.6 GHz: ------------------------- OpenOpt 0.39 ------------------------- solver: scipy_cobyla problem: unnamed type: NLP goal: maximum iter objFunVal log10(maxResidual) 0 3.900e+00 -100.00 1 1.154e+01 -0.00 2 8.491e+01 -0.92 3 5.312e+01 -14.89 4 5.321e+01 -3.32 5 5.312e+01 -6.99 6 5.312e+01 -15.78 7 5.312e+01 -14.89 8 5.312e+01 -14.89 istop: 1000 Solver: Time Elapsed = 0.82 CPU Time Elapsed = 0.81 objFunValue: 53.120866 (feasible, MaxResidual = 1.27676e-15) ''' print(r(x, y, z)) # [0.81121226345864939, 0.0087499573476878291, -1.0]
Global nonlinear optimization example
''' A simple FuncDesigner stochastic optimization example ''' from FuncDesigner import * from openopt import GLP A = distribution.gauss(4, 0.5) # gauss distribution with mean = 4, std = 0.5 # this is same to #from scipy import stats #_a = distribution.continuous(ppf=stats.norm(4, 5).ppf) # along with "gauss" you can use "norm" (for scipy.stats compatibility, yet I dislike it due to ambiguity with linalg.norm) # or "normal" B = distribution.exponential(3, 0.7) # location = 3, scale = 0.7 # for compatibility with scipy.stats you can use "expon" as well C = distribution.uniform(-1.5, 1.5) # uniform distribution from -1.5 to 1.5 a, b, c = oovars('a b c') x, y, z = oovars('x y z', lb=-1, ub=1) f = sin(b) + cos(b) + arcsin(b/100) + arccos(b/100) + arctan(b) +\ (1.5+x + 3*y*z)*cosh(b/ (20+x+y)) + sinh(b/30) + tanh(b) + arctanh(b/100) + arccosh(200+b) + arcsinh(3+b) + \ (x+y+0.4)*exp(b/ (15+x+z)) + sqrt(b+100) + abs(b-2) + log(b+50) + log10(b+100) + log2(b+100) +\ tan(c/50) + x + 2**(a/4 + x + y + z) objective = 0.15 * mean(f+2*x) + x*cos(y+2*z) + z * var(b) * std(c) + y * P(a - z + b*sin(c) > 5) constraints = [ P(a**2 - z + b*c < 4.7) < 0.03, # by default constraint tolerance is 10^-6 (P(c/b + z > sin(x)) > 0.02)(tol = 1e-10), # use tol 10^-10 instead; especially useful for equality constraints mean(b+y) <= 3.5 ] startPoint = {x: 0, y: 0, z: 0, a: A, b: B, c: C} ''' This is multiextremum problem (due to sin, cos etc), thus we have to use global nonlinear solver capable of handling nonlinear constraints (BTW having probability functions P() make it even discontinuous for discrete distribution(s) involved) ''' p = GLP(objective, startPoint, constraints = constraints) solver = 'de' # named after "differential evolution", check http://openopt.org/GLP for other available global solvers r = p.maximize(solver, maxTime = 150, maxDistributionSize=100) ''' output for Intel Atom 1.6 GHz (may differ due to random numbers involved in solver "de") ------------------------- OpenOpt 0.39 ------------------------- solver: de problem: unnamed type: GLP iter objFunVal log10(MaxResidual/ConTol) 0 6.008e+00 8.40 10 7.150e+00 -100.00 20 7.315e+00 -100.00 30 7.384e+00 -100.00 40 7.401e+00 -100.00 50 7.436e+00 -100.00 60 7.514e+00 -100.00 70 7.515e+00 -100.00 80 7.517e+00 -100.00 90 7.517e+00 -100.00 93 7.517e+00 -100.00 istop: 11 (Non-Success Number > maxNonSuccess = 15) Solver: Time Elapsed = 29.76 CPU Time Elapsed = 28.79 objFunValue: 7.516546 (feasible, max(residuals/requiredTolerances) = 0) ''' print(r(x, y, z)) # [0.99771171590186003, -0.15952854483416395, 0.85848779211294957] # let's check constraint values # below we could use c(r.xf) but c(r) is less-to-type and looks better print(P(a**2 - z + b*c < 4.7)(r)) # should be less than 0.03 #0.0200595929361 print(P(c/b + z > sin(x))(r)) # should be greater than 0.02 #0.029969536767 print(mean(b+y)(r)) # should be less than 3.5 #3.49947095196 # we could plot cdf (and pdf for continuous) for any stochastic function wrt the optimal parameters point, e.g. f(r).cdf.plot() # or, for example, (f + sin(x) + 2*f*cos(y+f) + z * P(f<x))(r).cdf.plot()
Example with 15 unknown variables
''' A little bit vectorized FuncDesigner stochastic optimization example with 15 unknowns (defined via oovars(n), not oovar(size=n), thus no any vectorization speedup) oovar(size=n) doesn't work with stochastic problems yet except of some cases where they doesn't interfere in objective or constraints (sticking oovar(size=n) with std, mean, var of a stochastic var/func is ok) ''' from FuncDesigner import * from openopt import NLP A = distribution.normal(4, 0.5) # gauss distribution with mean = 4, std = 0.5 B = distribution.exponential(3, 0.7) # location = 3, scale = 0.7 C = distribution.uniform(-1.5, 1.5) # uniform distribution from -1.5 to 1.5 a = oovars(10) # 10 variables b = oovars(4) # 4 variables c = oovar() # 1 variable x = oovars(10, lb=5, ub=15) # 10 unknowns y = oovars(4, lb=10, ub=20) # 4 unknowns z = oovar() # 1 unknown f = sum(x*a)**2 + sum(y*b) + c**4 + sum(x-1)**2 + sum(y)**2 + sum(y**2) + (z-5)**2 objective = 0.15 * mean(f+2*x) + sum(y) + sum(x) + z**2* std(c) constraints = [ P(sum(a)**2 + sum(b**2) + sum(x) > 50*(z + sum(y))) < 0.5, # by default constraint tolerance is 10^-6 mean(c + a[0] + b[1]+z) >= 15 ] startPoint = { x: [0]*10, # same to numpy.zeros(10); start point will have x[0]=0, x[1]=0, ..., x[9]=0 y: [0]*4, z: 0, a: [A]*10, b: [B]*4, c: C} p = NLP(objective, startPoint, constraints = constraints) solver = 'scipy_cobyla' r = p.minimize(solver, iprint = 5, maxDistributionSize=100) ''' output for Intel Atom 1.6 GHz (may differ due to random numbers involved in solver "de") ------------------------- OpenOpt 0.39 ------------------------- solver: scipy_cobyla problem: unnamed type: NLP goal: minimum iter objFunVal log10(maxResidual) 0 1.890e+01 1.00 5 6.791e+03 -1.65 10 6.790e+03 -4.46 15 6.790e+03 -6.07 17 6.790e+03 -14.75 istop: 1000 Solver: Time Elapsed = 45.84 CPU Time Elapsed = 45.6 objFunValue: 6789.6809 (feasible, MaxResidual = 1.77636e-15) '''
Recommended solvers
- Since v 0.421 you can use gradient-based solvers, such as ALGENCAN, IPOPT, ralg, gsubg etc. Usually they work faster than derivative-free or global (GLP) solvers, e.g. on this example ALGENCAN time elapsed is less than 1 second while scipy_cobyla spend ~20 sec. However pay attention that having function P() in your problem already may bring nonconvexity or other issues to the solver optimization trajectory, thus sometimes you'll have to use derivative-free (scipy_cobyla, BOBYQA) or GLP solvers (e.g. de) instead.
- goldenSection - for 1-dimensional box-bound constrained problems, local optimization
- bobyqa (requires nlopt installed) - for box-bound constrained problems, local optimization
- scipy_cobyla (requires SciPy installed) - for general constrained problems, local optimization
- pswarm (GLP solver) - for box-bound and linear inequalityconstrained problems, global optimization
- de ("differential evolution", GLP solver) - for general constrained problems, global optimization
Future Plans
See also: our general FuturePlans
- Implement a fast convolution and provide its possibility as optional parameter
- Add linear stochastic programming (currently we have only nonlinear)
- Add some more constructors for most common distributions (mostly will be done in order of users requests)
- Make it running with PyPy
- Some possible speedup for some parts of code
- Probably some parallel computations could be implemented
- Along with cdf provide graphics output and other methods for some other funcs (pdf, sf, logpdf, logcdf, etc)
- Provide moments of 3, 4 and higher orders
- Connect gradient-based solvers
Installation
Very often a change in stochastic module leads some changes in FuncDesigner and / or OpenOpt framework, thus it would be a bad decision to spread standalone addon (it could be incompatible with the OpenOpt/FuncDesigner version you have installed). Thus for nearest future the addon will be available for downloading in this webpage and will be updated from time to time, as soon as new essentially improved/enhanced version (with better speed, accuracy and / or new functionality) will be ready.
You can download latest OOSuite SP Edition here. Then you should run install_all.py with admin rights (or develop_all.py to use inplace installation). This version was compiled with Python2. Pay attention that for some functionality SciPy installed is required. It is recommended to take a look at our Install page (for Windows OS you may be interested in some Python distributions like PythonXY, EPD, for Linux you should have python-setuptools installed).
Stochastic calculations in FuncDesigner operate with quite large arrays and matrices, thus to speedup them you can use NumPy built with Intel MKL / AMD ACML.
Changelog
- v. 0.451:
- some bugfixes for OOSuite kernel 0.45
- v. 0.45:
- Minor changes + OOSuite kernel update to 0.45
- v. 0.43:
- v. 0.421:
- v. 0.42:
- function P() can handle argument interpolate (True/False/'auto'), this one is required for mixed discrete-continuous distributions (warning will be printed if it was omitted in the case)
- minor changes
- v. 0.393:
- essential speedup, especially for GLP solver de
- v. 0.392:
- add essential speedup (for some problems - several times) for GLP (global) solvers de and pswarm due to improved vectorization of computations
- some minor changes



