WIP: Add code for solving repeated games. #19
@cc7768 This looks very impressive. I haven't had time to look into the details yet, though. It would be helpful if you could write a short notebook that demonstrates how to use this code.
```julia
@@ -1,10 +1,22 @@
module Games

using Clp
using MathProgBase
using QuantEcon
```
Maybe add these packages to `REQUIRE`?
Yes. These need to go into the `REQUIRE` -- Thanks for reminding me.
Here is a short notebook that demonstrates the use of the code (https://gist.github.com/cc7768/cbc9cddd15a49630d30ad2e45bfcb739). It is currently in a gist (and you have to use the […])
```julia
# Update incentive constraints
b[nH+1] = flow_u_1(rpd, a1, a2) - best_dev_payoff_1(rpd, a2) - δ*_w1
b[nH+2] = flow_u_2(rpd, a1, a2) - best_dev_payoff_2(rpd, a1) - δ*_w2
```
I am afraid the definition of `flow_u_*` is confusing. I would define it as the value without `(1-δ)` (flow value!), and then multiply it by `(1-δ)` here. (Then `flow_u_*` would depend only on the underlying `NormalFormGame` instead of `rpd`.) Same for `best_dev_payoff_*`.
Good suggestion. This might help reduce some of the extra helper functions I put in.
I changed both the `flow_u` and `best_dev` payoff functions to return just the flow utility now. The `(1-δ)` is now included when the flow and continuation values are put together.
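With that convention, the incentive constraints quoted above would pick up the discount factor explicitly at assembly time. A minimal sketch of the idea (the `sg` stage-game argument and the exact signatures here are illustrative, not the actual code in this PR):

```julia
# Sketch: incentive constraints when flow_u_* / best_dev_payoff_* return
# undiscounted flow payoffs and depend only on the stage game sg.
# Player i is willing to play along iff
#   (1-δ)*flow_u_i + δ*w_i  ≥  (1-δ)*best_dev_payoff_i + δ*_w_i,
# so the constraint level passed to the LP becomes:
b[nH+1] = (1-δ)*(flow_u_1(sg, a1, a2) - best_dev_payoff_1(sg, a2)) - δ*_w1
b[nH+2] = (1-δ)*(flow_u_2(sg, a1, a2) - best_dev_payoff_2(sg, a1)) - δ*_w2
```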
The last step of the […] (If you remember, this is related to the "vertex enumeration" algorithm for computing Nash equilibria of a two-player normal form game.)
I hadn't seen that package. That could be very good news! You're right that the last step of the routine is converting from "H-representation" to "V-representation". I just wanted something that worked for now, so (as I'm sure you saw) I cooked up a brute force way to tackle the problem -- The plan was to eventually replace it with something more elegant out of a computational geometry package. If Polyhedra.jl has the tools I need, then I'm quite certain it will clean some things up.

Also (as a side note), if the Polyhedra.jl package does some of the things that I hope it does, then implementing the inner-hyperplane approximation (we currently only have outer-hyperplane) and the Abreu-Sannikov algorithms could be quite easy.
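For reference, the H-to-V conversion step could be sketched with Polyhedra.jl roughly as follows (hedged: names like `SimpleHRepresentation` and `CDDLibrary` follow the Polyhedra.jl/CDDLib.jl API of this period and may differ in later releases):

```julia
using Polyhedra, CDDLib

# (H, C) is the H-representation: each row h of H and level c of C
# defines the half-space h ⋅ x ≤ c.
hr = SimpleHRepresentation(H, C)
poly = polyhedron(hr, CDDLibrary())

# Ask cdd for the V-representation; the rows of vr.V are the vertices
# of the polytope of equilibrium payoff profiles.
vr = SimpleVRepresentation(poly)
vertices = vr.V
```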
```julia
deal with inequalities associated with Ax \leq b.

min c ⋅ x
Ax = b
```
How come you convert constraints to equalities? Doesn't `ClpSolver` accept `'<'` (as in the case in `worst_value_i`)?
Yes. At some point I had a reason for doing this -- It may have been for applying their methods to a different problem.
I'll work through this again, but I think you're right about not needing to build my own slack variables.
I no longer build slack variables -- A `<` works fine.
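As a minimal illustration of passing `<` directly (a toy LP, not the package's code; the solver setup is an assumption):

```julia
using MathProgBase
using Clp

# Toy problem: max x + y (i.e. min -x - y) subject to x + y ≤ 1, x, y ≥ 0.
c = [-1.0, -1.0]
A = [1.0 1.0]
b = [1.0]

# The '<' sense marks the constraint row as ≤, so no hand-built slack
# variables are required.
sol = linprog(c, A, '<', b, 0.0, Inf, ClpSolver())
```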
```julia
for player 1 is 5 then any point that awards player 1 higher than 5
continuation utility is cropped to 5.
"""
function create_unitcircle_points{T<:Real}(rpd::RepeatedGame{2, T}, nH::Int,
```
I don't see what this function is doing. Is this deriving the H-representation of script_W (the rectangle of feasible payoff profiles) given `H = unitcircle(nH)`? If that's the case, wouldn't the following work?

```julia
H = unitcircle(nH)
HT = H'
p1_extrema = extrema(g.players[1].payoff_array)
p2_extrema = extrema(g.players[2].payoff_array)
W0_vertices = gridmake([p1_extrema...], [p2_extrema...])
C = vec(maximum(W0_vertices * HT, 1))
```

- Is `Z` used in the main loop?
- This does not appear to be creating unit circle points; to me, a unit circle is the circle with radius one whose center is the origin.
- I occasionally come up with very poorly named functions (You can ask @sglyon -- He has to rename functions all the time). This appears to be one of those times... A circle with origin `o` and radius `r` is definitely not necessarily a unit circle... I had put that in there because it is using the `unitcircle` function to generate the directional vectors, but it doesn't make any sense in this context.
- I don't think `Z` is needed for the inner loop of the outer-hyperplane approximation algorithm, but I think it is used in the inner-hyperplane approximation algorithm.
- I think your code probably generates the same type of rectangle that I end up generating -- I originally didn't crop the circle, but just made the radius big enough to contain that rectangle (and then I realized that I was including a lot of infeasible payoffs, so it would make sense to crop it down). The only thing I need to ensure is that the order of the extrema comes out the same using your approach, because the initial values for each `C`/`Z` matter. Either way, this function can probably be greatly simplified. Thanks for catching this.
Renamed this function to `initialize_sg_hpl` (sg -> subgradient and hpl -> hyperplane level). This name fits more with what it is generating.

I also decided not to crop anymore -- It wasn't saving that much time, and now it generates a circle with origin `o` and radius `r`. I have a helper function that chooses `r` and `o` nicely for me inside of `outerapproximation`.
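A hedged sketch of that initialization (illustrative names only, not the actual `initialize_sg_hpl` source):

```julia
# Sketch: build nH subgradients H on the unit circle, plus hyperplane
# levels C and candidate points Z for the disk with center o and radius r.
function initialize_sg_hpl_sketch(nH::Int, o::Vector{Float64}, r::Float64)
    incr = 2pi / nH
    H = Array{Float64}(nH, 2)
    for i in 1:nH
        H[i, 1] = cos((i-1)*incr)  # evenly spaced unit directions
        H[i, 2] = sin((i-1)*incr)
    end
    # Since each row of H has norm one, the support of the disk in
    # direction h is h ⋅ o + r, attained at the point o + r*h.
    C = H*o + r*ones(nH)
    Z = r*H .+ o'
    return C, H, Z
end
```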
```julia
Approximates the set of equilibrium value set for a repeated game with the
outer hyperplane approximation described by Judd, Yeltekin, Conklin 2002
"""
function outerapproximation(rpd::RepeatedGame; nH=32, tol=1e-8, maxiter=500, nskipprint=1)
```
- Currently the function works only for two-player games. So `rpd::RepeatedGame` should be `rpd::RepeatedGame{2}`, or it should explicitly throw an error if the input game has more than two players.
- In the final version: a `verbose` option should be given, where `verbose=0` would suppress reporting. The default value of `nskipprint` should be much bigger than 1, such as 50 or 100.
- Yes. I meant to restrict the solution methods to `RepeatedGame{2}`, but it seems I left that out. Thanks for catching that -- I'll change that on all of the relevant methods I've added.
- Great idea -- I agree `nskipprint = 1` is too few. I had used it while I was debugging some things to try and figure out where the code was breaking down and never changed it back.
I added a type alias and changed all relevant `RepeatedGame` to `RepGame2` (which is a type alias for `RepeatedGame{2}`). I will review again in the next couple days to double check I got them all.

I also added a verbose option and increased `nskipprint`'s default to 50. Before merging (I'm still doing some debugging where it is helpful), I will change the verbose default input to `false`.
In my class today, a student (@shizejin) presented JYC with reference to your code and notebook. They are well written and very helpful.
```julia
while (iter < maxiter) && (dist > tol)
    # Compute the current worst values for each agent
    _w1 = worst_value_1(rpd, H, C)
    _w2 = worst_value_2(rpd, H, C)
```
If `[-1, 0]` and `[0, -1]` are contained in `H`, which is the case if `nH` is a multiple of 4, are these values equal to the values of `C` that correspond to `[-1, 0]` and `[0, -1]`?
Yes. That is true.
I hadn't thought about that. I suspect if I restricted people to using multiples of 4 then that might speed the code up in a useful way.
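If the multiples-of-4 restriction were adopted, the worst values could be read straight out of `C` rather than solved for with a linear program (a sketch, assuming `H = unitcircle(nH)` starts at `[1, 0]` and moves counter-clockwise):

```julia
# With nH a multiple of 4, [-1, 0] is row nH/2 + 1 of H and [0, -1]
# is row 3nH/4 + 1. C[i] is the maximum of H[i, :] ⋅ x over the set,
# so the minimum of x_1 is -C over the direction [-1, 0], etc.
i1 = div(nH, 2) + 1   # row of H equal to [-1.0, 0.0]
i2 = div(3nH, 4) + 1  # row of H equal to [0.0, -1.0]
_w1 = -C[i1]  # worst value for player 1, no LP needed
_w2 = -C[i2]  # worst value for player 2
```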
Excellent. I'm glad that the code and notebook were helpful! @thomassargent30 will be happy to hear that -- The goal of putting this together was to provide a useful exposition of the JYC algorithm.

I couldn't figure out why the linear programs are failing when delta is too small (I only spent an hour or two working on it though). Hypothetically, the constraint set should just reflect that it is a singleton (i.e. the intersection of the half-planes is just a single point). I'm wondering whether this is possibly some type of numerical issue. I had exactly this issue in mind when I wrote "Build more examples so that we know whether the code needs to be more defensive in certain areas" for the todo list.

I'm perfectly happy to use […]

Thanks a lot for putting time into this review. These are all really useful comments.
To obtain the V-representation from the H-representation of the equilibrium payoff set […]

I tried to install […]
You'll see that I merged the […]

I've also been adding more items to the todo list at the top to address your comments. I think your feedback will continue helping to simplify and clean up the code.
Made quite a bit of progress on this today. It should be relatively close to being finished, and I would like to merge it before too long so that I can then address and merge QuantEcon/QuantEcon.notebooks#48. Two issues remain:

I'm not sure if these can immediately be addressed, so, if you don't have strong objections, I would like to merge soon in spite of this. We can raise issues on these, and maybe someone smarter than me will think of a good way to handle this, but the functionality as is will work for a large portion of two-player games and is a useful tool for people to have at their disposal. The second issue also made me think it would be wise to put off writing the `innerapproximation`, because if this issue can't be solved then we might end up "rolling our own" solution to the problem, and I'd like to wait for this to settle before proceeding with the other algorithm.
""" | ||
function outerapproximation(rpd::RepGame2; nH=32, tol=1e-8, maxiter=500, | ||
verbose=false, nskipprint=50) |
Shouldn't we add an option like `lib::PolyhedraLibrary=default_library()`? I know the functionality for obtaining a default library is missing in `Polyhedra` (there is `getlibraryfor` but the design is not very good). We could add something like the default mechanism in MPB. For instance, we would add a function `defaultlibraryfor(::T)` that would select the first library that supports computation using type `T`.
Thanks for the comment (and for putting together `Polyhedra`!) @blegat.

If I understand correctly, your suggestion is to implement a function that chooses which backend to use for `Polyhedra`? I'm currently forcing users to use `CDDLibrary` (which is probably the backend most should use for now anyways), but it would definitely be nice to give some options.

I hadn't seen MPB's default code before. That looks slick, but I'll need a few minutes to make sense of it.
What it does is the following: it has an order of preference, and it tries to load each package in this order. The library taken is the library of the first package that loads.
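A minimal sketch of such a preference-ordered fallback (the package list and helper name below are assumptions for illustration, not the actual MPB or Polyhedra source):

```julia
# Try polyhedra backends in order of preference; use the first one
# whose package loads successfully.
const PREFERRED_LIBS = [(:CDDLib, :CDDLibrary), (:LRSLib, :LRSLibrary)]

function defaultlibrary_sketch()
    for (pkg, lib) in PREFERRED_LIBS
        try
            eval(Expr(:using, pkg))        # attempt to load the package
            return eval(Expr(:call, lib))  # instantiate its library type
        catch
            continue                       # not installed; try the next
        end
    end
    error("no polyhedra library found; install e.g. CDDLib")
end
```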
I think the low delta breakdowns aren't a problem now. I dug into it for a few more minutes and it seems like it wasn't ever a problem -- Instead of checking whether all actions are infeasible, I am now checking to make sure that at least one action can result in a feasible solution. We only need one because we use the max and set all infeasible to -Inf (see https://github.com/QuantEcon/Games.jl/blob/6a2a3a641a867d9ce9e94407d3162a022df71350/src/repeated_game.jl#L291-L321). It is now performing as expected (See examples below). I still haven't figured out exactly why the second issue is happening.
```julia
    Cnew[ih] = Cstar
else
    warn("Failed to find hyperplane level greater than -Inf")
    Cnew[ih] = h1*min_pane_payoffs[1] + h2*min_pane_payoffs[2]
```
What is the logic behind this? Are you assuming the minmax action profile is a Nash equilibrium?
Yes, I know that at least the NE is feasible so I use its hyperplane levels -- None of my examples run into this block anymore though.
Generally, the minmax action profile is not a Nash equilibrium. What happens if these values are replaced by the minimax payoffs?
What I said was wrong. I was thinking about a different part of the code and was trying to give a quick response before I called it a day.

This is not the minmax action profile. This block of code is only ever entered if the linear program was unable to find any feasible action/continuation pair for a specific subgradient. If it fails to find any feasible action/continuation pair, then I am plugging in a value that I know is feasible (the smallest pure action NE).

If you think the minmax action profile is a better fit here then I'm happy to test that out. I could also just throw an error here (which might be the safest thing to do). None of my examples enter this else statement anymore -- Before I made some fixes, it checked whether each of the elements of `Cstar` were greater than `-Inf`, but we only care that at least one is.
I think I'd prefer throwing an error to using the minmax values. What do you think?
Throwing an error sounds the safest option as you say.
Excellent. I added that to the code.
```julia
pane = pure_nash(sg)
no_pane_exists = (length(pane) == 0)
if no_pane_exists
    error("No pure action Nash equilibrium exists in stage game")
```
Given that `min_pane_payoffs` will no longer be used:

- Would it make sense to write a generator version (if any such exists in Julia) of `pure_nash` and then here go forward as soon as one pure action Nash equilibrium is found? There will be no difference in the worst case, but it should take a strictly smaller amount of time "on average".
- I think it is better to keep the option to skip this part, as was added in one of the previous commits.
I like these ideas.

- I will try and write a generator version of `pure_nash` in the next couple days. It definitely could save non-trivial amounts of time for a more complicated game.
- I have reintroduced the option to skip the check.
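An early-exit version might look roughly like this (a hedged sketch; `nums_actions` and `is_nash` are assumed from the package's `NormalFormGame` API and the function name is illustrative):

```julia
# Sketch: return the first pure action Nash equilibrium found instead of
# enumerating all of them, since one is enough for this feasibility check.
function first_pure_nash(nfg::NormalFormGame{2})
    n1, n2 = nfg.nums_actions
    for a1 in 1:n1, a2 in 1:n2
        if is_nash(nfg, (a1, a2))
            return (a1, a2)  # early exit
        end
    end
    return nothing  # no pure action Nash equilibrium exists
end
```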
I have added the possibility to select a default library in Polyhedra: JuliaPolyhedra/Polyhedra.jl#4
Thanks for the addition @blegat. I've incorporated the ability to choose what polyhedron library to use into the function. @oyamad I've also incorporated something close to your generator suggestion. The […]
@oyamad I think this PR is done. Do you have any additional comments before we merge it? Once this is done, I can do a last pass of cleanup on the repeated games notebook for QuantEcon/QuantEcon.notebooks#48
Just added minor comments.
Otherwise looks very good.
""" | ||
function outerapproximation(rpd::RepGame2; nH=32, tol=1e-8, maxiter=500, | ||
check_pane=true, verbose=false, nskipprint=50, |
I think `check_pure_nash` would be better than `check_pane`.
```julia
* maxiter: Maximum number of iterations
* verbose: Whether to display updates about iterations and distance
* nskipprint: Number of iterations between printing information (verbose=true)
* pane_check: Whether to perform a check about whether a pure action ne exists
```
- `pane_check` -> `check_pane` or `check_pure_nash`
- "pure action ne" -> "pure action Nash equilibrium" or "pure Nash equilibrium"
""" | ||
function outerapproximation(rpd::RepGame2; nH=32, tol=1e-8, maxiter=500, | ||
check_pane=true, verbose=false, nskipprint=50, | ||
plib=getlibraryfor(2, AbstractFloat)) |
Add a docstring for `plib`?
```julia
# Given the H-representation `(H, C)` of the computed polytope of
# equilibrium payoff profiles, we obtain its V-representation `vertices`.
# Here we use CDDLib.jl, a Julia wrapper of cdd, through Polyhedra.jl.
```
Please update this comment.
Hi @oyamad. I've addressed these last few issues and am going to go ahead and merge this. Thank you very much for your suggestions (and the time it took to make them) -- This pull request has been greatly enhanced by your review.
@cc7768 Thanks, this is a great addition to the library!
This pull request implements Judd, Yeltekin, and Conklin's (2002) outer hyperplane approximation algorithm, which approximates the equilibrium value sets for repeated games.
There are still some finishing touches that need to be put on this. The exported functions all have at least some sort of documentation, but it needs to be put into a more standard format. Will wait for QuantEcon/QuantEcon.jl#136 to be resolved before changing the doc styles though.
- To obtain a `VRepresentation`, we need a certain degree of accuracy. We also need the discount factor to be high enough that the value set is not a singleton.
- ~~Decide whether to force multiples of 4 for the number of approximating points in order to take advantage of `[-1, 0]` and `[0, -1]` being in the subgradient set. This would prevent us from solving a linear programming problem for getting worst values. See comments below.~~ I would like to leave this issue for now and come back to it in the future. I think there might be instances where the LP works better than just finding the right direction level.
- Use `<` instead of `=`. See comments below.
- ~~If it turns out to be easy to implement, consider adding `innerapproximation` in this PR -- Otherwise, leave for a future PR.~~ Leaving for future work.
- Change `verbose` to `false` before merging.

It would be great if people had some time to make a few comments on this and share what they think. @oyamad @spencerlyon2