
WIP: Add code for solving repeated games. #19

Merged: 25 commits from repeatedgames into master on Jan 17, 2017
Conversation

@cc7768 cc7768 commented Oct 27, 2016

This pull request implements Judd, Yeltekin, and Conklin's (2002) outer hyperplane approximation algorithm, which approximates the equilibrium value sets of repeated games.

There are still some finishing touches that need to be put on this. The exported functions all have at least some sort of documentation, but it needs to be put into a more standard format. Will wait for QuantEcon/QuantEcon.jl#136 to be resolved before changing the doc styles though.

  • Add a check to ensure that at least one pure strategy NE exists
  • Add new packages to the REQUIRE file
  • Build more examples so that we know whether the code needs to be more defensive in certain areas. It seems that when computing the V-representation, we need a certain degree of accuracy. We also need the discount factor to be high enough that the value set is not a singleton.
  • Decide whether to force the number of approximating points to be a multiple of 4 in order to take advantage of [-1, 0] and [0, -1] being in the subgradient set. This would let us avoid solving a linear programming problem to get the worst values. See comments below. I would like to leave this issue for now and come back to it in the future; there may be instances where the LP works better than just reading off the right direction's level.
  • Do we need to have the linear programs use slack variables? Can we reduce them to an inequality (<=) instead of an equality? See comments below.
  • Investigate Polyhedra.jl
  • If it turns out to be easy to implement, consider adding innerapproximation in this PR -- Otherwise, leave for a future PR. Leaving for future work.
  • Generator version of PANE. See below.
  • Improve documentation
  • Add tests
  • Change default for verbose to false before merging

It would be great if people had some time to make a few comments on this and share what they think. @oyamad @spencerlyon2
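For readers new to the approach, the core object is an outer H-representation: a set of direction vectors ("subgradients") H and hyperplane levels C such that the approximated set is {x : Hx <= C}. A minimal sketch of that representation (in Python purely for illustration -- the package itself is Julia, and `unitcircle`/`outer_levels` are illustrative names, not its API):

```python
import numpy as np

def unitcircle(nH):
    # nH evenly spaced direction vectors ("subgradients") on the unit circle
    angles = 2 * np.pi * np.arange(nH) / nH
    return np.column_stack([np.cos(angles), np.sin(angles)])

def outer_levels(vertices, H):
    # Hyperplane level in each direction: the furthest extent of the set,
    # so {x : Hx <= C} is the tightest outer approximation with these directions
    return (vertices @ H.T).max(axis=0)

H = unitcircle(8)
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
C = outer_levels(square, H)
# e.g. the level in direction (1, 0) is 1, and in direction (-1, 0) it is 0
```

The JYC algorithm iterates on exactly such a (H, C) pair, shrinking the levels C until they converge to (an outer bound on) the equilibrium value set.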

oyamad commented Nov 7, 2016

@cc7768 This looks very impressive. I haven't had time to look into the details, though.

It would be helpful if you can write a short notebook that demonstrates how to use this code.

@@ -1,10 +1,22 @@
module Games

using Clp
using MathProgBase
using QuantEcon
Member:

Maybe add these packages to REQUIRE?

Member Author:

Yes. These need to go into the REQUIRE -- Thanks for reminding me.

cc7768 commented Nov 9, 2016

Here is a short notebook that demonstrates the use of the code (https://gist.github.com/cc7768/cbc9cddd15a49630d30ad2e45bfcb739). It is currently in a gist (and you have to use the repeatedgames branch of Games), but once this is merged the notebook will move to the QuantEcon.notebooks repo.

oyamad commented Nov 10, 2016

@cc7768 Thanks, this is great. (Cc: @shizejin)


# Update incentive constraints
b[nH+1] = flow_u_1(rpd, a1, a2) - best_dev_payoff_1(rpd, a2) - δ*_w1
b[nH+2] = flow_u_2(rpd, a1, a2) - best_dev_payoff_2(rpd, a1) - δ*_w2
Member:

I am afraid the definition of flow_u_* is confusing. I would define it as the value without (1-δ) (flow value!),
and then multiply it by (1-δ) here. (Then flow_u_* would depend only on the underlying NormalFormGame instead of rpd.) Same for best_dev_payoff_*.

Member Author:

Good suggestion. This might help reduce some of the extra helper functions I put in.

Member Author:

I changed both the flow_u and best_dev payoff functions to return just the flow utility now. The (1-delta) is now included when the flow and continuation values are put together.
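The resulting division of labor can be sketched as follows (a Python illustration using the prisoner's-dilemma payoffs posted later in this thread; the function names mirror the thread's `flow_u_*`/`best_dev_payoff_*` but this is not the package's API):

```python
import numpy as np

pd_payoff = np.array([[9.0, 1.0],
                      [10.0, 3.0]])  # row player's payoffs from the thread's example

def flow_u_1(a1, a2):
    # Flow (stage) payoff, returned WITHOUT the (1 - delta) weight
    return pd_payoff[a1, a2]

def best_dev_payoff_1(a2):
    # Best one-shot deviation payoff against the opponent's action
    return pd_payoff[:, a2].max()

def incentive_compatible(a1, a2, delta, w_eq, w_punish):
    # The (1 - delta) weight is applied only here, where flow and
    # continuation values are put together
    lhs = (1 - delta) * flow_u_1(a1, a2) + delta * w_eq
    rhs = (1 - delta) * best_dev_payoff_1(a2) + delta * w_punish
    return lhs >= rhs

# Cooperation (action 0 vs action 0) is sustainable at delta = 0.85 with a
# punishment continuation value of 3, but not at delta = 0.05:
incentive_compatible(0, 0, 0.85, 9.0, 3.0)  # truthy
incentive_compatible(0, 0, 0.05, 9.0, 3.0)  # falsy
```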

oyamad commented Nov 10, 2016

The last step of the main loop routine seems to be the problem of converting the "H-representation" to the "V-representation" of a polytope (which I am not knowledgeable about at all). You might look at Polyhedra.jl and its sub packages LRSLib.jl and CDDLib.jl (wrappers of C programs).

(If you remember, this is related to the "vertex enumeration" algorithm for computing Nash equilibria of a two-player normal form game.)

cc7768 commented Nov 10, 2016

I hadn't seen that package. That could be very good news!

You're right that the last step of the routine is converting from "H-representation" to "V-representation". I just wanted something that worked for now, so (as I'm sure you saw) I cooked up a brute force way to tackle the problem -- the plan was to eventually replace it with something more elegant out of a computational geometry package. If Polyhedra.jl has the tools I need, then I'm quite certain it will clean some things up.

cc7768 commented Nov 10, 2016

Also (as a side note), if the Polyhedra.jl package does some of the things that I hope it does then implementing the inner-hyperplane approximation (we currently only have outer-hyperplane) and the Abreu Sannikov algorithms could be quite easy.

deal with inequalities associated with Ax \leq b.

min c ⋅ x
Ax = b
Member:

How come you convert constraints to equalities? Doesn't ClpSolver accept '<' (as in the case in worst_value_i)?

Member Author:

Yes. At some point I had a reason for doing this -- It may have been for applying their methods to a different problem.

I'll work through this again, but I think you're right about not needing to build my own slack variables.

Member Author:

I no longer build slack variables -- A < works fine.

for player 1 is 5 then any point that awards player 1 higher than 5
continuation utility is cropped to 5.
"""
function create_unitcircle_points{T<:Real}(rpd::RepeatedGame{2, T}, nH::Int,
Member:

I don't see what this function is doing. Is this deriving the H-representation of script_W (the rectangle of feasible payoff profiles) given H = unitcircle(nH)? If that's the case, wouldn't the following work?

H = unitcircle(nH)
HT = H'

p1_extrema = extrema(g.players[1].payoff_array)
p2_extrema = extrema(g.players[2].payoff_array)
W0_vertices = gridmake([p1_extrema...], [p2_extrema...])

C = vec(maximum(W0_vertices * HT, 1))
  • Is Z used in the main loop?
  • This does not appear to be creating unit circle points; to me, a unit circle is the circle with radius one, centered at the origin.

Member Author:

  • I occasionally come up with very poorly named functions (You can ask @sglyon -- He has to rename functions all the time). This appears to be one of those times... A circle with origin o and radius r is definitely not necessarily a unitcircle... I had put that in there because it is using the unitcircle function to generate the directional vectors, but it doesn't make any sense in this context.
  • I don't think Z is needed for the inner-loop of the outer-hyperplane approximation algorithm, but I think it is used in the inner-hyperplane approximation algorithm.
  • I think your code probably generates the same type of rectangle that I end up generating -- I originally didn't crop the circle, but just made the radius big enough to contain that rectangle (and then I realized that I was including a lot of infeasible payoffs, so it made sense to crop it down). The only thing I need to ensure is that the order of the extrema comes out the same using your approach, because the initial values for each C/Z matter. Either way, this function can probably be greatly simplified. Thanks for catching this.

Member Author:

Renamed this function to initialize_sg_hpl (sg->subgradient and hpl->hyperplane level). This name fits more with what it is generating.

I also decided not to crop anymore -- It wasn't saving that much time and now it generates a circle with origin o and radius r. I have a helper function that chooses r and o nicely for me inside of outerapproximation
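As described, the renamed function returns subgradients together with the hyperplane levels of a circle with origin `o` and radius `r`. A sketch (Python, illustrative only; the real function is Julia and its exact signature may differ):

```python
import numpy as np

def initialize_sg_hpl(nH, o, r):
    # Subgradients (sg) on the unit circle, plus the hyperplane levels (hpl)
    # of the circle with origin o and radius r: since each h has unit length,
    # the furthest extent of that circle in direction h is h . o + r
    angles = 2 * np.pi * np.arange(nH) / nH
    H = np.column_stack([np.cos(angles), np.sin(angles)])
    C = H @ np.asarray(o, dtype=float) + r
    return C, H
```

For example, with four directions and a circle of radius 2 centered at (1, 0), the levels in directions (1,0), (0,1), (-1,0), (0,-1) are 3, 2, 1, 2 respectively.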

Approximates the equilibrium value set for a repeated game with the
outer hyperplane approximation described by Judd, Yeltekin, Conklin 2002
"""
function outerapproximation(rpd::RepeatedGame; nH=32, tol=1e-8, maxiter=500, nskipprint=1)
Member:

  • Currently the function works only for two-player games. So rpd::RepeatedGame should be rpd::RepeatedGame{2}, or explicitly throw an error if the input game has more than two players.
  • In the final version: a verbose option should be given, where verbose=0 would suppress reporting. The default value of nskipprint should be much bigger than 1, such as 50 or 100.

Member Author:

  • Yes. I meant to restrict the solution methods to RepeatedGame{2}, but it seems I left that out. Thanks for catching that -- I'll change that on all of the relevant methods I've added.
  • Great idea -- I agree nskipprint = 1 is too few. I had used it while debugging to figure out where the code was breaking down and never changed it back.

Member Author (@cc7768, Nov 14, 2016):

I added a type alias and changed all relevant RepeatedGame to RepGame2 (which is a type alias for RepeatedGame{2}). I will review again in next couple days to double check I got them all.

I also added a verbose option and increased nskipprint's default to 50. Before merging (I'm still doing some debugging where it is helpful) I will change the verbose default input to false.

oyamad commented Nov 11, 2016

In my class today, a student (@shizejin) presented JYC with reference to your code and notebook. They are well written and very helpful.

  • If delta is set to a small value, so that V is the singleton set consisting of u(D, D), the program shows lpout.status = :Infeasible in solving the linear program. Do you have any idea about what's wrong?
  • (I would prefer delta to the unicode character δ. Unicode characters are not easy to type, and actually I feel a bit uncomfortable with non-ASCII characters, as I, and other Japanese people above a certain age, have been trained not to use them, such as Japanese characters, in non-Japanese software...)

while (iter < maxiter) && (dist > tol)
# Compute the current worst values for each agent
_w1 = worst_value_1(rpd, H, C)
_w2 = worst_value_2(rpd, H, C)
Member (@oyamad, Nov 11, 2016):

If [-1, 0] and [0, -1] are contained in H, which is the case if nH is a multiple of 4, are these values equal to the values of C that correspond to [-1, 0] and [0, -1]?

Member Author:

Yes. That is true.

I hadn't thought about that. I suspect if I restricted people to using multiples of 4 then that might speed the code up in a useful way.
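The observation can be sketched as follows (Python, illustrative): with the unit-circle ordering of subgradients and `nH` a multiple of 4, the direction `[-1, 0]` sits at 0-based index `nH/2`, so player 1's worst value over `{x : Hx <= C}` can be read directly from `C` instead of solving a linear program.

```python
import numpy as np

nH = 8
angles = 2 * np.pi * np.arange(nH) / nH
H = np.column_stack([np.cos(angles), np.sin(angles)])
C = H @ np.array([1.0, 0.0]) + 2.0  # levels of a circle centered at (1, 0), radius 2

# [-1, 0] is row nH//2, so the constraint -w1 <= C[nH//2] binds and the
# worst value for player 1 is just the negated level in that direction
i = nH // 2
w1 = -C[i]  # leftmost extent of the circle: 1 - 2 = -1
```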

cc7768 commented Nov 11, 2016

Excellent. I'm glad that the code and notebook were helpful! @thomassargent30 will be happy to hear that -- The goal of putting this together was to provide a useful exposition of the JYC algorithm.

I couldn't figure out why the linear programs are failing when delta is too small (I only spent an hour or two working on it though). Hypothetically, the constraint set should just reflect that it is a singleton (i.e. the intersection of the half-planes is just a single point). I'm wondering whether this is possibly some type of numerical issue. I had exactly this issue in mind when I wrote "Build more examples so that we know whether the code needs to be more defensive in certain areas" for the todo list.

I'm perfectly happy to use delta instead of δ. My text editor (and Julia REPL) will convert it for me if I type \delta <Tab> but it is probably safer not to put much unicode in the code library in case there are people who don't have text editors that make it easy for them.

Thanks a lot for putting time into this review. These are all really useful comments.

To obtain the V-representation from the H-representation of the equilibrium payoff set
oyamad commented Nov 13, 2016

CDDLib seems to work as expected; see 6bd995c (note that the points are stored as rows of the output array).

I tried to install LRSLib, but it failed on my Mac; as noted "Mac OS and Windows are not supported yet".
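The H-to-V conversion itself is a standard computational-geometry operation. For illustration only (the PR uses cdd through Polyhedra.jl; this sketch uses SciPy's Qhull wrapper instead), here is the same operation recovering the vertices of the unit square from its halfspaces:

```python
import numpy as np
from scipy.spatial import HalfspaceIntersection

# H-representation of the unit square: Hx <= C
H = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
C = np.array([1.0, 1.0, 0.0, 0.0])

# SciPy expects halfspaces stacked as [A | b] with A x + b <= 0,
# plus a strictly interior point
halfspaces = np.column_stack([H, -C])
interior = np.array([0.5, 0.5])
vertices = HalfspaceIntersection(halfspaces, interior).intersections

# vertices contains the four corners (0,0), (1,0), (0,1), (1,1),
# in no guaranteed order
```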

cc7768 commented Nov 14, 2016

You'll see that I merged the repeatedgames_cddlib into this branch. This really simplifies the outerapproximation function and I think it will make writing the innerapproximation function quite easy as well.

I've also been adding more items to the todo list at the top to address your comments. I think your feedback will continue helping to simplify and clean up the code.

cc7768 commented Dec 20, 2016

Made quite a bit of progress on this today. It should be relatively close to being finished and would like to merge it before too long so that I can then address and merge QuantEcon/QuantEcon.notebooks#48.

Two issues remain

  • The code still breaks down if the discount factor is low enough that the value set is a singleton. I haven't found a good way around this.
  • The geometry package that we are calling occasionally breaks down on the C side. I haven't figured out how to address this, but it seems that going to a higher tolerance usually fixes this problem. I have included a warning in the documentation.

I'm not sure if these can immediately be addressed, so, if you don't have strong objections, I would like to merge soon in spite of this. We can raise issues on these and maybe someone smarter than me will think of a good way to handle this, but the functionality as is will work for a large portion of two player games and is a useful tool for people to have at their disposal.

The second issue also made me think it would be wise to put off writing the innerapproximation: if this issue can't be solved, then we might end up "rolling our own" solution to the problem, and I'd like to wait for this to settle before proceeding with the other algorithm.

Thoughts? @oyamad @shizejin


"""
function outerapproximation(rpd::RepGame2; nH=32, tol=1e-8, maxiter=500,
verbose=false, nskipprint=50)
@blegat:

Shouldn't we add an option like lib::PolyhedraLibrary=default_library()? I know the functionality for obtaining a default library is missing in Polyhedra (there is getlibraryfor but the design is not very good). We could add something like the default mechanism in MPB.
For instance, we would add a function defaultlibraryfor(::T) that would select the first library that supports computation using type T.

Member Author:

Thanks for the comment (and for putting together Polyhedra!) @blegat.

If I understand correctly, your suggestion is to implement a function that chooses which backend to use for Polyhedra? I'm currently forcing users to use CDDLibrary (which is probably the backend most should use for now anyways), but it would definitely be nice to give some options.

I hadn't seen MPB's default code before. That looks slick, but I'll need a few minutes to make sense of it.

@blegat:

What it does is the following: it has an order of preference and it tries to load each package in that order. The library taken is the library of the first package that loads successfully.
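The preference-order mechanism described here is straightforward to sketch (Python for illustration; `default_library` and the backend names are hypothetical, not Polyhedra.jl's or MPB's API):

```python
import importlib

def default_library(preference):
    # Try each backend in order of preference; the first module that
    # imports successfully wins
    for name in preference:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("no backend in %r is installed" % (preference,))
```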

oyamad commented Dec 21, 2016

@cc7768

  1. For the issue with small delta, did you try other LP solvers?
  2. For the second issue, do you have an example code that generates the break down?

cc7768 commented Dec 21, 2016

I think the low delta breakdowns aren't a problem now. I dug into it for a few more minutes and it seems like it wasn't ever a problem -- Instead of checking whether all actions are infeasible, I am now checking to make sure that at least one action can result in a feasible solution. We only need one because we use the max and set all infeasible to -Inf (see https://github.com/QuantEcon/Games.jl/blob/6a2a3a641a867d9ce9e94407d3162a022df71350/src/repeated_game.jl#L291-L321). It is now performing as expected (See examples below).

I still haven't figured out exactly why the second issue is happening.

using Games

# Build game and repeated game
pd_payoff = [9.0 1.0
             10.0 3.0]
A = Player(pd_payoff)
B = Player(pd_payoff)
pd = NormalFormGame((A, B))
rpd1 = RepeatedGame(pd, 0.85)
rpd2 = RepeatedGame(pd, 0.25)
rpd3 = RepeatedGame(pd, 0.15)
rpd4 = RepeatedGame(pd, 0.05)

# Works
hp_pts1 = outerapproximation(rpd1; nH=128, maxiter=250, tol=1e-5);

# Doesn't work (ERROR: Numerically inconsistent)
# hp_pts2 = outerapproximation(rpd2; nH=128, maxiter=250, tol=1e-5);

# Works
hp_pts2 = outerapproximation(rpd2; nH=128, maxiter=250, tol=1e-9);

# Works and only returns (3, 3) and (9, 9)
hp_pts3 = outerapproximation(rpd3; nH=128, maxiter=250, tol=1e-9);


# Works and only returns (3, 3)
hp_pts4 = outerapproximation(rpd4; nH=128, maxiter=250, tol=1e-9);

Cnew[ih] = Cstar
else
warn("Failed to find hyperplane level greater than -Inf")
Cnew[ih] = h1*min_pane_payoffs[1] + h2*min_pane_payoffs[2]
Member:

What is the logic behind this? Are you assuming the minmax action profile is a Nash equilibrium?

Member Author:

Yes, I know that at least the NE is feasible so I use its hyperplane levels -- None of my examples run into this block anymore though.

Member:

Generally, the minmax action profile is not a Nash equilibrium. What happens if these values are replaced by the minimax payoffs?

Member Author:

What I said was wrong. I was thinking about a different part of the code and was trying to give a quick response before I call it a day.

This is not the minmax action profile. This block of code is only ever entered if the linear program was unable to find any feasible action/continuation pair for a specific subgradient. When that happens, I plug in a value that I know is feasible (the smallest pure action NE).

If you think the minmax action profile is a better fit here, then I'm happy to test that out. I could also just throw an error here (which might be the safest thing to do). None of my examples enter this else statement anymore -- before I made some fixes, it checked whether each element of Cstar was greater than -Inf, but we only care that at least one is.

Member Author:

I think I'd prefer throwing an error to using the minmax values. What do you think?

Member:

Throwing an error sounds the safest option as you say.

Member Author:

Excellent. I added that to the code.

pane = pure_nash(sg)
no_pane_exists = (length(pane) == 0)
if no_pane_exists
error("No pure action Nash equilibrium exists in stage game")
Member:

Given that min_pane_payoffs will no longer be used:

  • Would it make sense to write a generator version (if any such exists in Julia) of pure_nash and then here go forward as soon as one pure action Nash equilibrium is found? There would be no difference in the worst case, but it should take a strictly smaller amount of time "on average".

  • I think it is better to keep the option to skip this part, as was added at one of the previous commits.

Member Author:

I like these ideas.

  • I will try to write a generator version of pure_nash in the next couple of days. It definitely could save non-trivial amounts of time for a more complicated game.
  • I have reintroduced the option to skip the check.

blegat commented Dec 27, 2016

I have added the possibility to select a default library in Polyhedra: JuliaPolyhedra/Polyhedra.jl#4
The syntax is getlibraryfor(dim, T) where dim is the dimension of the polyhedron and T is the type in which the computation needs to be performed (typically Rational{BigInt} for exact arithmetic and Float64 for floating point arithmetic). The dim will be useful when we have support for an optimized library for a specific dimension such as CHull2d: cc7768/CHull2d.jl#3

cc7768 commented Jan 7, 2017

Thanks for the addition @blegat. I've incorporated the ability to choose which polyhedron library to use into the function.

@oyamad I've also incorporated something close to your generator suggestion. The pure_nash function now has an optional keyword argument which lets you specify how many PANE you would like to find. I set it to 1 when it is called in the outerapproximation function. The function will stop searching as soon as it finds one.
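The early-stopping search can be sketched like this (Python, illustrative; the real `pure_nash` is the Julia function in Games.jl), again with the prisoner's-dilemma payoffs from the example above:

```python
import numpy as np
from itertools import product

def pure_nash(payoffs1, payoffs2, ntofind=None):
    # Brute-force search for pure-action Nash equilibria; stop as soon as
    # `ntofind` equilibria are found instead of always scanning every profile
    found = []
    n1, n2 = payoffs1.shape
    for a1, a2 in product(range(n1), range(n2)):
        br1 = payoffs1[a1, a2] >= payoffs1[:, a2].max()  # a1 best-responds to a2
        br2 = payoffs2[a1, a2] >= payoffs2[a1, :].max()  # a2 best-responds to a1
        if br1 and br2:
            found.append((a1, a2))
            if ntofind is not None and len(found) >= ntofind:
                break
    return found

pd_payoff = np.array([[9.0, 1.0], [10.0, 3.0]])
# Player 2's payoff at (a1, a2) is pd_payoff[a2, a1] in this symmetric game
eqs = pure_nash(pd_payoff, pd_payoff.T, ntofind=1)  # the lone pure NE: (1, 1)
```

With `ntofind=1` the loop exits at the first equilibrium found, which is what the outerapproximation check needs.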

cc7768 commented Jan 16, 2017

@oyamad I think this PR is done. Do you have any additional comments before we merge it? Once this is done, I can do a last pass of cleanup on the repeated games notebook for QuantEcon/QuantEcon.notebooks#48

@oyamad (Member) left a comment:

Just added minor comments.
Otherwise looks very good.


"""
function outerapproximation(rpd::RepGame2; nH=32, tol=1e-8, maxiter=500,
check_pane=true, verbose=false, nskipprint=50,
Member:

I think check_pure_nash would be better than check_pane.

* maxiter: Maximum number of iterations
* verbose: Whether to display updates about iterations and distance
* nskipprint: Number of iterations between printing information (verbose=true)
* pane_check: Whether to perform a check about whether a pure action ne exists
Member:

  • pane_check -> check_pane or check_pure_nash
  • "pure action ne" -> "pure action Nash equilibrium" or "pure Nash equilibrium"

"""
function outerapproximation(rpd::RepGame2; nH=32, tol=1e-8, maxiter=500,
check_pane=true, verbose=false, nskipprint=50,
plib=getlibraryfor(2, AbstractFloat))
Member:

Add docstring for plib?


# Given the H-representation `(H, C)` of the computed polytope of
# equilibrium payoff profiles, we obtain its V-representation `vertices`.
# Here we use CDDLib.jl, a Julia wrapper of cdd, through Polyhedra.jl.
Member:

Please update this comment.

cc7768 commented Jan 17, 2017

Hi @oyamad. I've addressed these last few issues and am going to go ahead and merge this.

Thank you very much for your suggestions (and the time it took to make them) -- This pull request has been greatly enhanced by your review.

oyamad commented Jan 17, 2017

@cc7768 Thanks, this is a great addition to the library!

@cc7768 cc7768 merged commit fd1c11d into master Jan 17, 2017
@cc7768 cc7768 deleted the repeatedgames branch January 17, 2017 01:30