Optimizing Course Scheduling at USC Marshall School of Business

18 minute read

Photo by USC Marshall
Programming Language: Python

Problem Statement

The scheduling of courses and classrooms at USC Marshall is a complicated systematic procedure with lots of constraints. Under this situation, the tradeoff between requirements of different parties involved and limited resources of classrooms during the prime-time (10am – 4pm) can be a conundrum.

After analyzing historical data, we found that in some departments, classes with low seat utilization rate indicating limited large classroom resources were not well used. However, in other departments, their courses with such high seat utilization rates denoted that classrooms were almost fully occupied and failed to provide students with a good learning experience.

Therefore, we would like to implement a prescriptive model to this case in order to help the Marshall administration office generate a satisfying course schedule so that for each Marshall class, the final number of enrolled students will be as close to the size of the classroom as possible. The objective of this optimization is to maximize the average seat utilization rate of all the courses offered by Marshall and the course arranged during prime-time (10am – 4pm) at the same time.

Data Cleaning and Wrangling

a. Import Data

We started by importing libraries, raw datasets and read them. The raw datasets here include not only course schedules in Marshall but also from other schools such as Public Policy and Law. We used Gurobi package to create a prescriptive model. Gurobi is one the most robust optimization solver for linear programming (LP) and Mixed Integer Programming (MIP) that we can use to solve this problem.

import gurobipy as grb
import pandas as pd
import numpy as np
# Read Excel files
data = "Marshall_Course_Enrollment_1516_1617.xlsx"
data2 = "Marshall_Room_Capacity_Chart.xlsx"
# Read sheets needed
schedules = pd.read_excel(data, sheet_name="Schedules")
room = pd.read_excel(data2, sheet_name="Room", index_col=0)

We converted time into decimal format in hours for easy calculations.

def convert(inputTime):
    try:
        hh, mm, ss = str(inputTime).split(':')
        ans = (float(hh)+float(mm)/float(60))
    except:
        ans = 0
    return ans

We defined a function to calculate the number of days each course requires.

def n_o_d(Days):
    try:
        ans = len(Days)
    except:
        ans = 9999
    return ans

c. Convert the original data

# Apply the convert function to begin time
schedules['First Begin Time'] = schedules['First Begin Time'].apply(convert)
# Apply the convert function to end time
schedules['First End Time'] = schedules['First End Time'].apply(convert)
# Calculate time slots needed for each course
schedules['Slots'] = np.ceil((schedules['First End Time'] - schedules['First Begin Time'])/2)
# Calculate number of days each course requires
schedules['Days per week'] = schedules['First Days'].apply(n_o_d)

d. Data cleaning

# Filter data for only courses in 2017 spring
schedules = schedules[schedules['Term'] == 20171]
# List of marshall departments
dept = ["ACCT", "BAEP", "BUCO", "DSO", "FBE", "MKT", "MOR"]
# Filter data based on the list
schedules = schedules[schedules['Department'].isin(dept)]
# Exclude online courses and courses in office
schedules = schedules[schedules['First Room'] != 'ONLINE']
schedules = schedules[schedules['First Room'] != 'OFFICE']
# Exclude courses with too large registered count that marshall classrooms cannot accommodate
schedules = schedules[schedules['Reg Count'] <= max(room.loc[:,'Size'])]
# Exclude courses with unknown duration
schedules = schedules[schedules['Slots'] != 0]
# Exclude courses that require more than two hours per day
schedules = schedules[schedules['Slots'] <= 1]
# Exclude courses that require more than two days per week
schedules = schedules[schedules['Days per week'] <= 2]
# Reindex schedules
schedules.index = range(0,len(schedules))

Assumptions and Proposal Evaluation

Before the analysis, we made five main assumptions to shrink the scope and simplify our model. The five assumptions are as follows:

The analysis is only limited to courses within Marshall
One course can only require two days per week at most
The time slot is 2 hours (to save computational power)
The registration count for each course is similar to last year
Seat utilization rate and prime time utilization rate are two metrics that valued equally

To evaluate our proposal, we used readily available historical data (excel files) including Marshall_Course_Enrollment_1516_1617 and Marshall_Room_Capacity_Chart as our data inputs. We exclude those courses that are outside of Marshall or longer than 2 hours to save the computational time and narrow the scope of the problem. Also, we chose Spring 2017 as our historical data for the course and enrollment information which can serve as a reference for class scheduling of Spring 2018.

Diagnostic Analyses

We wanted to know the average utilization rate and percentage of students taking courses in the prime-time in Spring 2017.

a. Before Optimization: Average Seat Utilization Rate

# Join datasets with key 'First Room' for schedules and key 'Room' for room
data = (pd.merge(schedules, room, how = 'left', left_on= 'First Room', right_index=True)
       [['Term','Section','Department','First Begin Time','First End Time','First Room',
         'Reg Count','Seats','Size']]
       ) # Size is from room dataset not schedules dataset
# Filter out courses held in rooms outside Marshall
data = data[data['Size'].notnull()]
# Define a dictionary for the dataframe aggregation
aggregations = {
    'Registered':'sum',
    'Size':'sum'
}
data = data.rename(columns={'Reg Count':'Registered'})
output = data.agg(aggregations)
print('Average seat utilization rate in 20171: {0:.2f}' .format(output.Registered/output.Size))

Average seat utilization rate in 20171: 0.67

b. Before Optimization: Percentage of Students Taking Courses in Prime Time

# Filter out courses held NOT in Prime Time
primeData = data.drop(data[(data['First Begin Time'] < float(10)) | (data['First End Time'] > float(16))].index)
outputPrime = primeData.agg(aggregations)
print('Percentage of students taking courses in prime time in 20171: {0:.2f}'
      .format(outputPrime.Registered/output.Registered))

Percentage of students taking courses in prime time in 20171: 0.59

Formulation

1. Performance metrics

There are two metrics of performance in our analysis. One is to represent the benefit of the administration office and the other reflects the interest of student and professors. The average seat utilization rate is the average of seat utilization rate for each course. As this rate increases, it will be beneficial for all the resources such as professor, classroom, and other operational resources. The prime-time (10am – 4pm) utilization rate measures the percentage of students taking courses in prime time. This metric can reflect the satisfaction degree both for students and for professor since the time slots in prime time are those time that more people prefer to take. We combine these two metrics with equal weights and obtain an overall rate to quantify the performance of class scheduling throughout the semester.

2. Data

I = {“M”, “T”, “W”, “H”, “F”}: set of days of week.
J: {“8”, ”10”, ……, “20”}: set of time slots, each with 2 hours long.
$J_{P}$ : set of time slots in prime time.
K: set of classrooms.
Z: set of courses.
$r_{z}$ : registered count of each course.
$n_{z}$ : number of days each course requires.
$s_{k}$ : size of each classroom.
C: total number of courses.
$w_{a}$ , $w_{b}$ : Weights of two key metrics in the objective function, $w_{a} + w_{b}$ = 1.

3. Decision Variable

$x_{ijkz}$ : Whether course z is assigned to i day of week and j time slot in classroom k. (binary)

4. MIP

Optimization Model

1. Input data

I = ['M','T','W','H','F'] # day of week
J = np.arange(8, 22, 2) # time slot
Jprime = np.arange(10, 16, 2) # time slot in prime time
K = room.index # classroom

# course (with different sections)
Z = []
for z in range(0,len(schedules.index)):
    Z.append(schedules.loc[z, 'Course'] + ' ' + str(schedules.loc[z, 'Section']))

# reindex by using the combined course name and section so that each index is unique
schedules.index = Z

# registered count of each course
r = {}
for z in Z:
    r[z] = schedules.loc[z, 'Reg Count']

# number of days each course requires
n = {}
for z in Z:
    n[z] = schedules.loc[z, 'Days per week']

# size of classroom
s = {}
for k in K:
    s[k] = room.loc[k, 'Size']

2. Build a model

# Our decision variable x[i,j,k,z] is a binary variable of
# whether course z is assigned to ith day of week and jth time slot in classroom k
mod=grb.Model()
x={}
for i in I:
    for j in J:
        for k in K:
            for z in Z:
                x[i,j,k,z] = mod.addVar(vtype = grb.GRB.BINARY, name = 'x[{0},{1},{2},{3}]'.format(i, j, k, z))

3. Objective function

Important to note: Our objective is to maximize the average seat utilization rate of all the courses offered by Marshall and at the same time maximize the percentage of students that take courses in the prime time (10:00-16:00) we assign equal weight (0.5,0.5) to both ratio in our final objective value.

mod.setObjective(0.5 * ((sum((x[i,j,k,z] / n[z]) * (r[z] / s[k]) for i in I for j in J for k in K for z in Z))/len(Z)) +
                 0.5 * ((sum((x[i,j,k,z] / n[z]) * r[z] for i in I for j in Jprime for k in K for z in Z)) /
                        (sum(r[z] for z in Z))), sense = grb.GRB.MAXIMIZE)

4. Define some constraints

Constraints:Classroom capacity

# The size of the classroom assigned should be greater or equal to the registered count of the course
for i in I:
    for j in J:
        for k in K:
            for z in Z:
                mod.addConstr(x[i,j,k,z] * (s[k] - r[z]) >= 0,
                name = 'Number of seats in {0} is greater or equal to number of registered students in {1}'.format(k, z))

Constraints:Classroom availability

# Each classroom in each time slot in each day of week can only be assigned no more than once to a course
for i in I:
    for j in J:
        for k in K:
            mod.addConstr(sum(x[i,j,k,z] for z in Z) <= 1,
                name = 'Classroom {0} on {1} in time slot {2} is assigned to no more than one class'.format(k, i, j))

Constraints:Course duration

# Each course z should be assigned to exact n[z] time slots
for z in Z:
    mod.addConstr(sum(x[i,j,k,z] for i in I for j in J for k in K) == n[z],
                name = 'Course {0} is assigned to exact 1 time slot'.format(z))

Constraint:Courses taught in two days

# For those courses having classes two days per week, if they are assigned to a time slot in a classroom in a day,
# they will be assigned to the same time slot in the same classroom two days later.
# And for those courses with two classes per week, they can't be assigned to Friday
for z in Z:
    if n[z] == 2:
        for j in J:
            for k in K:            
                mod.addConstr(x['M',j,k,z] == x['W',j,k,z])
                mod.addConstr(x['T',j,k,z] == x['H',j,k,z])
                mod.addConstr(x['F',j,k,z] == 0)

5. Solve the Output

mod.setParam('OutputFlag',False)
mod.optimize()

# Calculate the average seat utilization rate
avg_utilization_rate = (sum((x[i,j,k,z].x / n[z]) * (r[z] / s[k]) for i in I for j in J for k in K for z in Z))/len(Z)
# Calculate percentage of students taking courses in prime time
percent_prime = (sum((x[i,j,k,z].x / n[z]) * r[z] for i in I for j in Jprime for k in K for z in Z)) / (sum(r[z] for z in Z))

# Building a dataframe for the output
Objective_Value = ['Optimal Objective', '{0:.2f}'.format(mod.ObjVal)]
Output_rate = ['Average seat utilization rate', '{0:.2f}'.format(avg_utilization_rate)]
Output_percentage = ['Percentage of students taking courses in prime time', '{0:.2f}'.format(percent_prime)]
Summary = pd.DataFrame([Objective_Value, Output_rate, Output_percentage], columns = ['', 'Value'])
Summary

		Value
0	Optimal Objective	0.89
1	Average seat utilization rate	0.86
2	Percentage of students taking courses in prime...	0.92

Solution

# Building a table via a list of lists
SolutionTable = []
for z in Z:
    for k in K:
        for i in I:
            for j in J:
                SolutionTable.append([z, k, i+' '+str(j), r[z], s[k], n[z], x[i,j,k,z].x])
# Transforming table to data frame
Solution = pd.DataFrame(SolutionTable, columns = ['Course', 'Classroom', 'Time', 'Reg Count', 'Size', 'Days', 'Assignment'])
Solution = Solution.loc[Solution['Assignment'] != 0]
Solution

	Course	Classroom	Time	Reg Count	Size	Days	Assignment
731	ACCT-370 14029	JFF LL105	F 14	130	149	1	1.0
3013	ACCT-370 14028	JKP202	M 14	37	54	2	1.0
3027	ACCT-370 14028	JKP202	W 14	37	54	2	1.0
3300	ACCT-370 14026	ACC303	T 14	46	46	2	1.0
3314	ACCT-370 14026	ACC303	H 14	46	46	2	1.0
5610	ACCT-370 14027	JFF240	T 14	47	48	2	1.0
5624	ACCT-370 14027	JFF240	H 14	47	48	2	1.0
7001	ACCT-371 14044	JFF LL105	M 10	128	149	1	1.0
9034	ACCT-371 14040	JFF331	M 16	25	36	2	1.0
9048	ACCT-371 14040	JFF331	W 16	25	36	2	1.0
9528	ACCT-371 14042	ACC205	T 10	32	36	2	1.0
9542	ACCT-371 14042	ACC205	H 10	32	36	2	1.0
12119	ACCT-371 14041	JFF327	T 12	33	36	2	1.0
12133	ACCT-371 14041	JFF327	H 12	33	36	2	1.0
13518	ACCT-371 14043	JFF241	T 10	38	48	2	1.0
13532	ACCT-371 14043	JFF241	H 10	38	48	2	1.0
15442	ACCT-372 14052	JFF417	T 8	27	36	2	1.0
15456	ACCT-372 14052	JFF417	H 8	27	36	2	1.0
16915	ACCT-372 14050	JFF331	T 14	35	36	2	1.0
16929	ACCT-372 14050	JFF331	H 14	35	36	2	1.0
17538	ACCT-372 14051	ACC310	M 14	37	54	2	1.0
17552	ACCT-372 14051	ACC310	W 14	37	54	2	1.0
19666	ACCT-373 14058	JFF LL125	F 14	80	101	1	1.0
20756	ACCT-373 14057	BRI202	M 10	40	42	2	1.0
20770	ACCT-373 14057	BRI202	W 10	40	42	2	1.0
22332	ACCT-373 14056	BRI202	M 12	40	42	2	1.0
22346	ACCT-373 14056	BRI202	W 12	40	42	2	1.0
24257	ACCT-374 14060	JFF LL102	M 12	46	48	2	1.0
24271	ACCT-374 14060	JFF LL102	W 12	46	48	2	1.0
26505	ACCT-374 14061	JKP102	T 14	52	52	2	1.0
...	...	...	...	...	...	...	...
655453	WRIT-340 66719	ACC312	T 10	19	20	2	1.0
655467	WRIT-340 66719	ACC312	H 10	19	20	2	1.0
657021	WRIT-340 66789	ACC312	M 10	19	20	2	1.0
657035	WRIT-340 66789	ACC312	W 10	19	20	2	1.0
659334	WRIT-340 66736	JFF313	M 16	19	20	2	1.0
659348	WRIT-340 66736	JFF313	W 16	19	20	2	1.0
660881	WRIT-340 66748	JFF312	T 16	19	20	2	1.0
660895	WRIT-340 66748	JFF312	H 16	19	20	2	1.0
662450	WRIT-340 66742	JFF312	M 18	19	20	2	1.0
662464	WRIT-340 66742	JFF312	W 18	19	20	2	1.0
664065	WRIT-340 66725	JFF313	T 14	19	20	2	1.0
664079	WRIT-340 66725	JFF313	H 14	19	20	2	1.0
665641	WRIT-340 66763	JFF313	T 16	19	20	2	1.0
665655	WRIT-340 66763	JFF313	H 16	19	20	2	1.0
666472	WRIT-340 66734	ACC312	M 12	19	20	2	1.0
666486	WRIT-340 66734	ACC312	W 12	19	20	2	1.0
668746	WRIT-340 66767	JFF312	M 10	19	20	2	1.0
668760	WRIT-340 66767	JFF312	W 10	19	20	2	1.0
670322	WRIT-340 66740	JFF312	M 12	19	20	2	1.0
670336	WRIT-340 66740	JFF312	W 12	19	20	2	1.0
671206	WRIT-340 66728	ACC312	T 16	19	20	2	1.0
671220	WRIT-340 66728	ACC312	H 16	19	20	2	1.0
673512	WRIT-340 66787	JFF313	T 8	19	20	2	1.0
673526	WRIT-340 66787	JFF313	H 8	19	20	2	1.0
675058	WRIT-340 66714	JFF312	T 20	19	20	2	1.0
675072	WRIT-340 66714	JFF312	H 20	19	20	2	1.0
676657	WRIT-340 66777	JFF313	M 12	19	20	2	1.0
676671	WRIT-340 66777	JFF313	W 12	19	20	2	1.0
677502	WRIT-340 66781	ACC312	T 8	19	20	2	1.0
677516	WRIT-340 66781	ACC312	H 8	19	20	2	1.0

740 rows × 7 columns

# Concatenate day of week and time slot in each day
Time = []
for i in I:
    for j in J:
        Time.append(i+' '+str(j))

# Create a data frame with time as index and classroom as columns
ScheduleTable = pd.DataFrame(index = Time, columns = K)
# Assign blank value into each cell
ScheduleTable = ScheduleTable.fillna('')

# If a course is assigned to a specific time and classroom, fill it into corresponding cell in the data frame
for t in Time:
    for k in K:
        for a in Solution.index:
            if str(Solution.loc[a,'Time']) == str(t):
                if str(Solution.loc[a,'Classroom']) == str(k):
                    ScheduleTable.loc[t,k] = Solution.loc[a,'Course']
ScheduleTable

Room	ACC 306B	ACC201	ACC205	ACC236	ACC303	ACC306B	ACC310	ACC312	BRI202	BRI202A	...	JFF416	JFF417	JKP102	JKP104	JKP110	JKP112	JKP202	JKP204	JKP210	JKP212
M 8			FBE-529 15403					WRIT-340 66785		BUAD-302 14681	...
M 10	ACCT-582 14289	BAEP-451 14378	BUAD-302 14662	MKT-599 16564	BUAD-281 14524	ACCT-574 14202	BUAD-281 14520	WRIT-340 66789	ACCT-373 14057	BUAD-304 14728	...	BUAD-497 15104	BUAD-306 14786	ACCT-470 14116	BAEP-451 14375	ECON-351 26358	ECON-352 26367	ACCT-430 14145	DSO-530 16276	FBE-391 15310	BUAD-311 14904
M 12	GSBA-612 16111	BUAD-497 15108	BUAD-302 14654	ACCT-473 14136	BUAD-497 15110		ACCT-473 14137	WRIT-340 66734	ACCT-373 14056	BUAD-310 14892	...	MKT-556 16537	BUAD-302 14650	DSO-547 16280	ACCT-470 14115	MOR-588 16720	FBE-421 15324	BUAD-306 14782	ACCT-470 14117	BUAD-311T 14906	MOR-421 16677
M 14		BUAD-281 14530	BAEP-554 14448	ACCT-581 14277	MKT-450 16496		ACCT-372 14051	WRIT-340 66769	FBE-554 15425	BUAD-310 14897	...	BUAD-307 14848	BAEP-551 14446	BUAD-280 14509	BUAD-280 14507	FBE-421 15325	ECON-352 26368	ACCT-370 14028	BUAD-280 14513	FBE-441 15362	ECON-352 26363
M 16			DSO-582 16289					ACCT-574 14203		BUAD-302 14677	...		ACCT-581 14276
M 18			ACCT-580T 14273					WRIT-340 66779		BUAD-302T 14704	...		MOR-598 16730
M 20			BUAD-305 14768					WRIT-340 66716		MKT-445 16493	...		FBE-558 15440
T 8			ACCT-430 14146					WRIT-340 66781		BUAD-302 14683	...		ACCT-372 14052
T 10		BUAD-281 14521	ACCT-371 14042	BUAD-306 14787	FBE-557 15436	BUAD-307 14822	ACCT-474 14140	WRIT-340 66719	MOR-463 16673	BUAD-304 14751	...	BUAD-497 15092	ACCT-416 14105	MKT-533 16530	DSO-510 16302	ECON-352 26365	BUAD-311 14912	ACCT-377 14067	FBE-535 15417	BUAD-311 14903	BUAD-311 14905
T 12		BAEP-451 14379	ACCT-530 14207	MOR-431 16671	BUAD-281 14528		ACCT-528 14242	BUAD-305 14766	ACCT-410 14003	BUAD-304 14755	...	MKT-405 16469	MOR-469 16680	DSO-570 16298	ECON-351 26349	ECON-351 26348	FBE-458 15367	BUAD-280 14508	ACCT-568T 14246	ACCT-474 14141	BUAD-302 14685
T 14		DSO-424 16218	BUAD-201 14486	ACCT-410 14004	ACCT-370 14026		ACCT-528 14228	WRIT-340 66791	MKT-530 16525	BUAD-310 14923	...	MKT-405 16471	BUAD-302T 14701	ACCT-374 14061	ECON-351 26350	BUAD-306 14780	BUAD-306 14783	FBE-400 15315	MOR-579 16725	BUAD-311 14902	ECON-351 26356
T 16								WRIT-340 66728		BUAD-302 14646	...		MKT-525 16518
T 18			DSO-433 16227					WRIT-340 66715		FBE-470 15380	...
T 20			BAEP-491 14397					WRIT-340 66746		BUAD-311 14914	...
W 8			FBE-529 15403					WRIT-340 66785		BUAD-302 14681	...
W 10	ACCT-582 14289	BAEP-451 14378	BUAD-302 14662	MKT-599 16564	BUAD-281 14524	ACCT-574 14202	BUAD-281 14520	WRIT-340 66789	ACCT-373 14057	BUAD-304 14721	...	BUAD-497 15104	BUAD-306 14786	ACCT-470 14116	BAEP-451 14375	ECON-351 26358	ECON-352 26367	ACCT-430 14145	DSO-530 16276	FBE-391 15310	BUAD-311 14904
W 12	GSBA-612 16111	BUAD-497 15108	BUAD-302 14654	ACCT-473 14136	BUAD-497 15110	BUAD-310 14916	ACCT-473 14137	WRIT-340 66734	ACCT-373 14056	BUAD-304 14749	...	MKT-556 16537	BUAD-302 14650	DSO-547 16280	ACCT-470 14115	MOR-588 16720	FBE-421 15324	BUAD-306 14782	ACCT-470 14117	BUAD-311T 14906	MOR-421 16677
W 14		BUAD-281 14530	BAEP-554 14448	ACCT-581 14277	MKT-450 16496		ACCT-372 14051	WRIT-340 66769	FBE-554 15425	BUAD-307 14832	...	BUAD-307 14848	BAEP-551 14446	BUAD-280 14509	BUAD-280 14507	FBE-421 15325	ECON-352 26368	ACCT-370 14028	BUAD-280 14513	FBE-441 15362	ECON-352 26363
W 16			DSO-582 16289					ACCT-574 14203		BUAD-302 14677	...		ACCT-581 14276
W 18			ACCT-580T 14273					WRIT-340 66779		BUAD-302T 14704	...		MOR-598 16730
W 20			BUAD-305 14768					WRIT-340 66716		MKT-445 16493	...		FBE-558 15440
H 8			ACCT-430 14146					WRIT-340 66781		BUAD-302 14683	...		ACCT-372 14052
H 10	BUAD-304 14740	BUAD-281 14521	ACCT-371 14042	BUAD-306 14787	FBE-557 15436		ACCT-474 14140	WRIT-340 66719	MOR-463 16673	BUAD-304 14745	...	BUAD-497 15092	ACCT-416 14105	MKT-533 16530	DSO-510 16302	ECON-352 26365	BUAD-311 14912	ACCT-377 14067	FBE-535 15417	BUAD-311 14903	BUAD-311 14905
H 12		BAEP-451 14379	ACCT-530 14207	MOR-431 16671	BUAD-281 14528		ACCT-528 14242	BUAD-305 14766	ACCT-410 14003	BUAD-304 14731	...	MKT-405 16469	MOR-469 16680	DSO-570 16298	ECON-351 26349	ECON-351 26348	FBE-458 15367	BUAD-280 14508	ACCT-568T 14246	ACCT-474 14141	BUAD-302 14685
H 14		DSO-424 16218	BUAD-201 14486	ACCT-410 14004	ACCT-370 14026		ACCT-528 14228	WRIT-340 66791	MKT-530 16525	BAEP-599 14425	...	MKT-405 16471	BUAD-302T 14701	ACCT-374 14061	ECON-351 26350	BUAD-306 14780	BUAD-306 14783	FBE-400 15315	MOR-579 16725	BUAD-311 14902	ECON-351 26356
H 16								WRIT-340 66728		BUAD-302 14646	...		MKT-525 16518
H 18			DSO-433 16227					WRIT-340 66715		FBE-470 15380	...
H 20			BAEP-491 14397					WRIT-340 66746		BUAD-311 14914	...
F 8											...
F 10		BUAD-425 15005	BUAD-310 14888	BUAD-307 14802	BUAD-304 14732	ACCT-526 14233	BUAD-302T 14708	BUAD-304 14743	BUAD-307 14834	BUAD-304 14727	...	BUAD-307 14820	BUAD-304 14737	BUAD-304 14739	ACCT-570T 14257		BUAD-304 14723	BUAD-310 14893	BUAD-304 14733
F 12		BUAD-425 15008	DSO-401 16215	BUAD-425 15003	BUAD-425 15016	MOR-331 16682	BUAD-252 14591		BUAD-307 14804	BUAD-304 14725	...	BUAD-304 14722	BUAD-307 14808	BUAD-304 14748	BAEP-499 14401			BUAD-310 14891	BUAD-304 14746
F 14		BUAD-304 14752	BUAD-304 14736	BAEP-465 14392	BUAD-425 15013		BAEP-465 14393		BUAD-307 14846	BUAD-310 14899	...	BUAD-302T 14705	BUAD-310 14895	BUAD-302T 14707	BUCO-450 15198	BUAD-304 14747		DSO-401 16216	BUAD-200 14487
F 16											...
F 18											...
F 20											...

35 rows × 45 columns

Output several dataframes to the same excel file

writer = pd.ExcelWriter('output.xlsx')
Summary.to_excel(writer, sheet_name = 'Summary', index = False)
Solution.to_excel(writer, sheet_name = 'Solution', index = False)
ScheduleTable.to_excel(writer, sheet_name = 'Schedule Table')
writer.save()

Results and Conclusions

Our optimization output is actionable, logically coherent from the input data and has plausible assumptions to back it up.

1. Actionable

Using our optimization outputs, The Scheduling Team will be able to get the potential gains as listed below:

Improve the current average seat utilization rate of Marshall courses from 67% to 86%
Increase the percentage of students taking the Marshall courses in prime time from 59% to 92%
Monitor and evaluate the Marshall course schedules and classrooms from one term to another term more efficiently than the manual procedure
Efficiently utilize the available space to schedule all courses and output a feasible schedule on time, well before registration starts
Satisfy the preferences of faculty by prioritizing courses taught in two days will be held at the same time

2. Logically coherent from the input data

The output implies which Marshall courses should be assigned into specific time (on Monday to Friday at 8AM to 10PM) and a specific classroom out of 45 classrooms. From the table we can see, more courses are allocated in the prime time (10AM to 4PM) that implies our model is able to satisfy the preferences of students.
Furthermore, for courses taught in two days, they will be assigned into different days but at the same time (see the whole output for further clarification) and this feature will satisfy the preferences of faculty. All courses are assigned after considering optimal classrooms capacity, possible crash courses, courses taught in two days at the same time and the prime-time preference.

3. Has plausible assumptions to back it up

We believe with the following assumptions, our model would work well for a larger dataset.

We excluded courses longer than two hours. In fact, these courses are around 20% of all the courses so we highly encourage the team to schedule them manually.
At this stage, we decided to overlook time consecutiveness. Actually, we have tried to add this constraint into our model, but it is computationally expensive to run the model (it took more than 24 hours to run the model for a large dataset).
We excluded online courses and courses held in offices.

Reference

Share on

Twitter Facebook Google+ LinkedIn

Yakan Musthofa

Optimizing Course Scheduling at USC Marshall School of Business

Problem Statement

Data Cleaning and Wrangling

a. Import Data

c. Convert the original data

d. Data cleaning

Assumptions and Proposal Evaluation

Diagnostic Analyses

a. Before Optimization: Average Seat Utilization Rate

b. Before Optimization: Percentage of Students Taking Courses in Prime Time

Formulation

1. Performance metrics

2. Data

3. Decision Variable

4. MIP

Optimization Model

1. Input data

2. Build a model

3. Objective function

4. Define some constraints

5. Solve the Output

Solution

Output several dataframes to the same excel file

Results and Conclusions

1. Actionable

2. Logically coherent from the input data

3. Has plausible assumptions to back it up

Reference

Share on

You may also enjoy

New York Property Fraud Detection Using Unsupervised Models

Supervised Model for Detecting Fraudulent Transactions

Forecasting San Francisco Region Home Median Values (Part 2)

Forecasting San Francisco Region Home Median Values (Part 1)

Yakan Musthofa

Problem Statement

Data Cleaning and Wrangling

a. Import Data

b. Define Related Functions

c. Convert the original data

d. Data cleaning

Assumptions and Proposal Evaluation

Diagnostic Analyses

a. Before Optimization: Average Seat Utilization Rate

b. Before Optimization: Percentage of Students Taking Courses in Prime Time

Formulation

1. Performance metrics

2. Data

3. Decision Variable

4. MIP

Optimization Model

1. Input data

2. Build a model

3. Objective function

4. Define some constraints

5. Solve the Output

Solution

Output several dataframes to the same excel file

Results and Conclusions

1. Actionable

2. Logically coherent from the input data

3. Has plausible assumptions to back it up

Reference

Share on

You may also enjoy

New York Property Fraud Detection Using Unsupervised Models

Supervised Model for Detecting Fraudulent Transactions

Forecasting San Francisco Region Home Median Values (Part 2)

Forecasting San Francisco Region Home Median Values (Part 1)