Math Problem?
Let's say I'm presented with an opportunity once every hour.
There are 7 possible opportunities, and it's completely random which I'll be presented with each hour.
For each of the 7 possible opportunities, I can make one of 7 different choices.
Each choice has a variety of consequences, but no matter what, every choice will cost me some points.
If at any time, the total of my points drops to 0 or less, I lose out on everything completely.
While each of the 7 opportunities presents me with 7 different choices, and the outcomes of each choice are different, if the same opportunity arises again, I will get the exact same result for making a given choice.
For example:
Opportunity 1:
Choice 1 = +5a +3b -2c +0d -10 points
Choice 2 = +3a -8b +3c +4d - 1 points
Choice 3 = -3a +0b +1c -2d - 6 points
so on and so forth........
For Opportunity 2, each choice will represent a gain or loss of different values, but each choice for that opportunity will always have the same result.
To recap:
Opps random on the hour. Choice results are fixed per Opp.
I start with 100 points. How do I calculate the best choice per opportunity, and does it change based on my current point standing??
IE: Given Choice one above, if my current points are 9, I can't choose Choice 1 or I lose.
The purpose of making each choice at each given opportunity varies as well.
For one round I may have the goal of maximizing my final value of A, but for the next round the goal might be to maximize the sum of the final values of B and D.
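To make the question concrete, here is a minimal sketch of a one-step (greedy) chooser. It is only an illustration, not a full answer: the tuple layout, the name `best_choice`, and the idea of expressing the round's goal as a weight vector are all assumptions. It only looks at the current hour and ignores what future opportunities might cost.

```python
# Hypothetical encoding of the three example choices for Opportunity 1.
# Each entry: (delta_a, delta_b, delta_c, delta_d, point_cost)
OPPORTUNITY_1 = {
    1: (5, 3, -2, 0, 10),
    2: (3, -8, 3, 4, 1),
    3: (-3, 0, 1, -2, 6),
}


def best_choice(choices, points, weights):
    """Return the affordable choice with the highest weighted gain, or None.

    weights is (wa, wb, wc, wd): (1, 0, 0, 0) means "maximize A",
    (0, 1, 0, 1) means "maximize B + D".
    """
    best, best_score = None, None
    for name, (*deltas, cost) in choices.items():
        if points - cost <= 0:  # dropping to 0 or less loses everything
            continue
        score = sum(w * d for w, d in zip(weights, deltas))
        if best_score is None or score > best_score:
            best, best_score = name, score
    return best


print(best_choice(OPPORTUNITY_1, 100, (1, 0, 0, 0)))  # 1
print(best_choice(OPPORTUNITY_1, 9, (1, 0, 0, 0)))    # 2
print(best_choice(OPPORTUNITY_1, 100, (0, 1, 0, 1)))  # 1
```

Even in this simple form the answer depends on the current point total: with 9 points, Choice 1 is filtered out and Choice 2 becomes the best option for maximizing A.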

Isetnefret . . - . . Gavoryn
Those of you on the port side of the plane can look out and see the Grand Canyon. Those of you on the starboard can look out and see a cloud shaped like a horsey.
-

Isetnefret - Posts: 1083
- Joined: Wed Dec 24, 2008 3:48 pm
- Location: Columbia, MO
Re: Math Problem?
Isetnefret wrote:Each choice has a variety of consequences, but no matter what, every choice will cost me some points.
If at any time, the total of my points drops to 0 or less, I lose out on everything completely.
if the same opportunity arises again, I will get the exact same result for making a given choice.
So you have a 7x7 matrix of opportunities x choices, choices cost points to "buy", and have varying payoffs.
What's the objective of the game? I'm assuming the goal is to maximize points, or to reach a point goal in the minimum time.
Does a choice always result in a net point loss, or do you sometimes gain more than it costs? I will assume that some choices yield a net gain (win points) and some yield a net loss (lose points). If you can't ever gain points, then your strategy seems to be to minimize point loss.
Assuming that we can gain points, my strategy is thus:
- Keep track of a results matrix, correlating choices with opportunities.
- Choose a threshold of safety, below which we play more conservatively.
TLDR:
Above threshold: Unknown > Positive > Minimize Loss
Below threshold: Positive > Unknown > Minimize Loss
Win = results matrix is full, or we reach a point goal.
- If I know all of the results (results matrix is populated fully), choose the one with the highest point yield. (This maximizes returns, and basically means I've won, or will eventually.)
- When low on points (at/below threshold), choose the best positive result. If no results yet yield positive values, pick something unknown: it might be better than a loss. If nothing is unknown for this opportunity, minimize losses.
- When points are high (above threshold), we're relatively "safe": we feel unlikely to lose all our points. So, choose something we haven't chosen before. This gets us closer to our goal.
- Once we find a choice that yields a positive value, our points will basically fluctuate around our comfort threshold until we've filled our results matrix. It's possible we might not do that very well, though. Because our choices are constrained to one column at a time, we are likely to end up having optimized our choices for some opportunities while others remain unsolved.
A meta-strategy would be to experiment and find out what good thresholds are (50%? 75%?), as well as how likely we are to actually find a positive result. Consider that we're likely to spend several rounds where our only choices available will result in a net loss.
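A minimal sketch of the threshold rule described above, for a single opportunity (one column of the results matrix). The function name `pick`, the use of `None` for untried choices, and picking the first untried choice are assumptions made for illustration.

```python
def pick(known, points, threshold):
    """Pick a choice index for one opportunity under the threshold rule.

    known[i] is the observed net point change of choice i, or None if
    that choice has never been tried for this opportunity.
    """
    unknown = [i for i, v in enumerate(known) if v is None]
    tried = [i for i, v in enumerate(known) if v is not None]
    positive = [i for i in tried if known[i] > 0]
    best_known = max(tried, key=lambda i: known[i]) if tried else None

    if not unknown:             # column fully explored: take the best result
        return best_known
    if points > threshold:      # above: Unknown > Positive > Minimize Loss
        return unknown[0]
    if positive:                # at/below: Positive > Unknown > Minimize Loss
        return max(positive, key=lambda i: known[i])
    return unknown[0]


print(pick([None, 4, -2], points=80, threshold=50))  # 0 (explore)
print(pick([None, 4, -2], points=30, threshold=50))  # 1 (known positive)
```

When nothing is left to explore, the best known result is either the highest positive payoff or, if every result is a loss, the smallest loss, so one branch covers both cases.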
edit:
I don't think this changes just because your evaluation function for "winning" differs from round to round (whether per play or per opportunity). When playing conservatively, pick whatever your evaluation function scores highest; when playing aggressively, Play to Learn, as dictated by your point threshold.
-

Kelaan - Posts: 4036
- Joined: Thu Jan 03, 2008 12:01 pm
Re: Math Problem?
Okays, so to give a more detailed implementation:

A large variance in payoffs (e.g. 50) can easily yield game overs if a threshold is not suitably conservative. A small ceiling (10) in payoff magnitude means it's nearly impossible (unless the payoffs are not random) to get a game-over.
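For anyone who wants to experiment with this, here is a rough simulation sketch. It is not the code linked in this post; it assumes hidden payoffs are uniform random integers in [-max_payoff, max_payoff], a starting total of 100, a threshold of 50, and 200 rounds per game. The names `simulate` and `game_over_rate` are made up for the example.

```python
import random


def simulate(max_payoff, threshold=50, start=100, rounds=200, rng=random):
    """Play one game; return True if points ever drop to 0 or less."""
    n = 7
    # Hidden payoffs: assumed uniform integers in [-max_payoff, max_payoff].
    payoff = [[rng.randint(-max_payoff, max_payoff) for _ in range(n)]
              for _ in range(n)]
    known = [[None] * n for _ in range(n)]  # results matrix, filled as we play
    points = start
    for _ in range(rounds):
        opp = rng.randrange(n)              # random opportunity each hour
        col = known[opp]
        unknown = [i for i, v in enumerate(col) if v is None]
        tried = [i for i, v in enumerate(col) if v is not None]
        positive = [i for i in tried if col[i] > 0]
        if unknown and (points > threshold or not positive):
            choice = unknown[0]             # explore something new
        else:
            choice = max(tried, key=lambda i: col[i])  # best known result
        col[choice] = payoff[opp][choice]
        points += col[choice]
        if points <= 0:
            return True
    return False


def game_over_rate(max_payoff, trials=1000, seed=0, **kwargs):
    """Fraction of simulated games that end in a game over."""
    rng = random.Random(seed)
    losses = sum(simulate(max_payoff, rng=rng, **kwargs) for _ in range(trials))
    return losses / trials


for m in (10, 50):
    print(m, game_over_rate(m))
```

Varying `threshold` alongside `max_payoff` is one way to look for a suitably conservative threshold for a given payoff spread.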
Full code (highlighted) if you're interested.

-

Kelaan - Posts: 4036
- Joined: Thu Jan 03, 2008 12:01 pm
Re: Math Problem?
I think what he is saying is that you gain/lose "victory points" a/b/c/d and lose "money points" every time you choose an option.
Can you choose not to play? (If so, you can just pass until a more favourable opportunity comes up.)
Now, until you have learned what each choice does, you are at risk of losing all your money on an unfavourable choice. Since you have no way of knowing the outcome of a choice until you choose it, the only strategy lies in deciding whether to choose something you know is moderately good or to plump for a new option.
I know that the result for a slightly different puzzle is to examine (1/e) of the options then choose the next choice which is larger than the largest value seen, but that is for the case where you can examine choices as they appear, and accept or reject them, but can only accept one. Your puzzle is similar in nature, in a "are the unknown options better than the one I know?" way.
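That slightly different puzzle is usually called the secretary problem. A minimal sketch of its 1/e stopping rule is below; the function name `secretary_pick` and the fallback of accepting the last value when nothing beats the look-only phase are assumptions for illustration. It applies to the accept-or-reject setting only, not directly to the opportunity game.

```python
import math


def secretary_pick(values):
    """Skip the first n/e values, then take the first one that beats them all."""
    if not values:
        raise ValueError("need at least one value")
    n = len(values)
    cutoff = int(n / math.e)  # size of the look-only phase
    best_seen = max(values[:cutoff], default=float("-inf"))
    for v in values[cutoff:]:
        if v > best_seen:
            return v
    return values[-1]         # nothing better appeared: stuck with the last one


print(secretary_pick([3, 1, 4, 1, 5, 9, 2, 6]))  # 4
```

In the example, the first two values (3 and 1) are only observed, and 4 is accepted as the first later value that exceeds both, even though 9 comes afterwards.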
- Candiru
- Posts: 2479
- Joined: Mon May 28, 2007 12:21 pm
Re: Math Problem?
Candiru wrote:Now, until you have learned what each choice does, you are at risk of losing all your money on an unfavourable choice. Since you have no way of knowing the outcome of a choice until you choose it, the only strategy lies in deciding whether to choose something you know is moderately good or to plump for a new option.
I assume you can't just pass: you must choose, and pay the cost.
I assume that one's rewards for victory are the same resource used to pay for choices. Otherwise, you have a strictly limited number of choices you'll be allowed to make.
If you have no known "good" options, pick something new. It might not be as bad.
If you pick something minimally bad, or just barely good because it's the only thing you know of, you'll Lose More Slowly ... but are unlikely to win.
-

Kelaan - Posts: 4036
- Joined: Thu Jan 03, 2008 12:01 pm
