Revisions: /
URL: http://bbinsomnia.info
Date: 2016-11-17

CALCULATING AVERAGE PLAYING TIME WITH WEIGHTS

Distribution of Minutes in ABA League Season 2015-16

Miha Peče

In the previous post I calculated the ABA League's average playing time for starters and bench players. Because I neglected overtimes there, I will recalculate the averages here, this time with a weighted arithmetic mean. This post contains only the calculation, without analysis.

Again, I will use the ABA League dataset for the 2015-16 season.

In [1]:
import sqlite3
import pandas as pd
In [2]:
conn = sqlite3.connect('../sql/aba_liga2015.sqlite')
In [3]:
sql_pd = "SELECT NAME, MP, scr_game_players.TM, STARTER, TM_MIN FROM scr_game_players, scr_team_vs_team \
          WHERE scr_game_players.ROUND='regular' AND scr_game_players.NumG=scr_team_vs_team.NumG AND \
          scr_game_players.TM=scr_team_vs_team.TM"
df_main = pd.read_sql(sql_pd, conn)
In [4]:
df_start = df_main[(df_main["STARTER"]==1) & (df_main["MP"]!=0)].copy()
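# k = regulation team minutes (5 players x 40 min = 200) / actual team minutes
df_start['k%']= 200 / df_start["TM_MIN"]
# show the last 5 rows where k < 1, i.e. games that went to overtime
df_start[df_start['k%']<1].tail()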
Out[4]:
                    NAME  MP        TM  STARTER  TM_MIN        k%
4187     Koprivica Miloš   7  Partizan        1     300  0.666667
4188        Wilson Jamar  52  Partizan        1     300  0.666667
4192  Jones Kevin Andrew  55  Partizan        1     300  0.666667
4193           Murić Edo  45  Partizan        1     300  0.666667
4197    Marinković Vanja  31  Partizan        1     300  0.666667
In [5]:
df_weight = df_start.copy()
df_weight['MP'] = df_weight['MP'] * df_weight['k%']
df_weight[df_weight['k%']<1].tail()
Out[5]:
                    NAME         MP        TM  STARTER  TM_MIN        k%
4187     Koprivica Miloš   4.666667  Partizan        1     300  0.666667
4188        Wilson Jamar  34.666667  Partizan        1     300  0.666667
4192  Jones Kevin Andrew  36.666667  Partizan        1     300  0.666667
4193           Murić Edo  30.000000  Partizan        1     300  0.666667
4197    Marinković Vanja  20.666667  Partizan        1     300  0.666667
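
To make the scaling step concrete, here is a minimal sketch on made-up numbers (the two games and their minutes are invented for illustration and are not taken from the dataset). A regulation game has 5 players x 40 minutes = 200 team minutes, and each 5-minute overtime adds 25, so a game with four overtimes has 300. Multiplying by k = 200 / TM_MIN expresses every player's minutes on the scale of a regulation game.

```python
import pandas as pd

# Hypothetical toy data: game 1 ended in regulation (200 team minutes),
# game 2 had four overtimes (300 team minutes).
toy = pd.DataFrame({
    "NumG":   [1, 1, 2, 2],
    "MP":     [30.0, 10.0, 45.0, 15.0],
    "TM_MIN": [200, 200, 300, 300],
})

# Scale factor: regulation team minutes divided by actual team minutes.
toy["k"] = 200 / toy["TM_MIN"]
toy["MP_scaled"] = toy["MP"] * toy["k"]

plain_mean = toy["MP"].mean()          # (30 + 10 + 45 + 15) / 4
scaled_mean = toy["MP_scaled"].mean()  # (30 + 10 + 30 + 10) / 4
print(round(plain_mean, 2), round(scaled_mean, 2))
```

In this toy case the overtime game inflates the plain mean, while the scaled mean treats 45 minutes in a 60-minute game the same as 30 minutes in a 40-minute game.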
In [6]:
start_mean = df_start['MP'].mean()
start_mean = (int(start_mean) + (start_mean-int(start_mean)) * 0.6) # decimal fraction of a minute -> seconds (mm.ss notation)
round(start_mean, 2)
Out[6]:
24.399999999999999
In [7]:
start_w_mean = df_weight['MP'].mean()
start_w_mean = (int(start_w_mean) + (start_w_mean-int(start_w_mean)) * 0.6) # decimal fraction of a minute -> seconds (mm.ss notation)
round(start_w_mean, 2)
Out[7]:
24.280000000000001

As we can see, the correction amounts to 12 seconds. With overtimes taken into account, the average playing time for starters dropped from 24:40 to 24:28.
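
The notebook writes times as a decimal number whose fractional part stands for seconds, so 24.28 is read as 24:28. As an alternative sketch (the helper name `to_mmss` is hypothetical and not part of the original notebook), decimal minutes can be converted to an explicit MM:SS string, which avoids mistaking 24.28 for 24.28 decimal minutes.

```python
def to_mmss(minutes):
    """Convert decimal minutes (e.g. 24.5) to an 'MM:SS' string."""
    # Work in whole seconds so that rounding can never produce ':60'.
    total_seconds = int(round(minutes * 60))
    mins, secs = divmod(total_seconds, 60)
    return "{}:{:02d}".format(mins, secs)


print(to_mmss(24.5))   # 24:30
print(to_mmss(22.49))  # 22:29
```

Rounding the total number of seconds first, and only then splitting into minutes and seconds, means a value such as 14.999 becomes 15:00 rather than 14:60.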

Let's also calculate the averages for individual teams, where some results should be interesting. Partizan and Olimpija, for instance, played one match with four overtimes.

In [10]:
teams = ('Union Olimpija', 'Cibona', 'Zadar', 'Budućnost', 'Cedevita', 'Crvena Zvezda', 'Igokea', 'Krka',
          'Mega Leks', 'Sutjeska', 'MZT Skopje', 'Partizan', 'Tajfun', 'Metalac')
df_teams_wg = pd.DataFrame()
for team in teams:
    df_tmp = df_weight[(df_weight["TM"] == team)]
    ser = df_tmp["MP"].mean().round(2)
    df_teams_wg[team] = pd.Series(ser, index=["With weights"])
df_teams_wg
Out[10]:
              Union Olimpija  Cibona  Zadar  Budućnost  Cedevita  Crvena Zvezda  Igokea   Krka  Mega Leks  Sutjeska  MZT Skopje  Partizan  Tajfun  Metalac
With weights           22.49   25.07  24.58      26.85     22.74           22.3   24.47  24.41      24.96     25.52       26.95     23.17   23.96    25.05

Let's combine the results in one table for better readability and comparison.

In [11]:
df_teams_lin = pd.DataFrame()
for team in teams:
    df_tmp = df_start[(df_start["TM"] == team)]
    ser = df_tmp["MP"].mean().round(2)
    df_teams_lin[team] = pd.Series(ser, index=["Without weights"])
    
# Combine frames for comparison
# (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
df_combine = pd.concat([df_teams_wg, df_teams_lin])

# Add total
total = pd.Series([df_weight['MP'].mean(), df_start['MP'].mean()], index=["With weights", "Without weights"])
df_combine["TOTAL"] = total
df_combine = df_combine.transpose().round(2)
df_combine['With weights'] = df_combine['With weights'].apply(lambda x: int(x) + (x-int(x)) * 0.6).round(2)
df_combine['Without weights'] = df_combine['Without weights'].apply(lambda x: int(x) + (x-int(x)) * 0.6).round(2)
df_combine
Out[11]:
                With weights  Without weights
Union Olimpija         22.29            22.59
Cibona                 25.04            25.04
Zadar                  24.35            24.35
Budućnost              26.51            27.08
Cedevita               22.44            22.50
Crvena Zvezda          22.18            22.25
Igokea                 24.28            24.35
Krka                   24.25            24.53
Mega Leks              24.58            25.10
Sutjeska               25.31            25.53
MZT Skopje             26.57            26.57
Partizan               23.10            23.40
Tajfun                 23.58            24.10
Metalac                25.03            25.03
TOTAL                  24.28            24.40
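
The per-team loops above can also be written with `groupby`. The following is only a sketch on invented data, not a rerun of the notebook: the frame `toy` and its values are hypothetical, while the column names `TM`, `MP` and `TM_MIN` follow the dataset.

```python
import pandas as pd

# Hypothetical toy data with the same column names as df_start.
toy = pd.DataFrame({
    "TM":     ["Partizan", "Partizan", "Krka", "Krka"],
    "MP":     [45.0, 30.0, 20.0, 30.0],
    "TM_MIN": [300, 300, 200, 200],
})
toy["MP_scaled"] = toy["MP"] * 200 / toy["TM_MIN"]

# One mean per team for both columns, with teams as rows.
per_team = toy.groupby("TM")[["MP_scaled", "MP"]].mean()
per_team.columns = ["With weights", "Without weights"]

# Add the overall means as a TOTAL row.
per_team.loc["TOTAL"] = [toy["MP_scaled"].mean(), toy["MP"].mean()]
print(per_team.round(2))
```

This produces the table in its final orientation directly, so no transpose is needed and the list of team names does not have to be typed by hand.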

I conclude with the calculation of bench minutes.

In [12]:
df_bench = df_main[(df_main["STARTER"]==0) & (df_main["MP"]!=0)].copy()
df_bench['k%']= 200 / df_bench["TM_MIN"]
df_bench_weight = df_bench.copy()
df_bench_weight['MP'] = df_bench['MP'] * df_bench['k%']
df_bench_weight[df_bench_weight['k%']<1].tail()
Out[12]:
                      NAME         MP        TM  STARTER  TM_MIN        k%
4189           Vrabac Adin   2.000000  Partizan        0     300  0.666667
4190      Williams Darrell  10.000000  Partizan        0     300  0.666667
4191   Milutinović Andreja   5.333333  Partizan        0     300  0.666667
4194      Vitkovac Čedomir  27.333333  Partizan        0     300  0.666667
4196  Cvetković Aleksandar  28.666667  Partizan        0     300  0.666667
In [13]:
bench_mean = df_bench['MP'].mean()
bench_mean = (int(bench_mean) + (bench_mean-int(bench_mean)) * 0.6) # decimal fraction of a minute -> seconds (mm.ss notation)
round(bench_mean, 2)
Out[13]:
15.039999999999999
In [14]:
bench_w_mean = df_bench_weight['MP'].mean()
bench_w_mean = (int(bench_w_mean) + (bench_w_mean-int(bench_w_mean)) * 0.6) # decimal fraction of a minute -> seconds (mm.ss notation)
round(bench_w_mean, 2)
Out[14]:
14.56

The average playing time for a bench player was 14:56. That is 8 seconds less than the 15:04 calculated without weights.