Run OptaPlanner workloads on OpenShift, part I.
Have you ever wondered if OptaPlanner can leverage any cloud platform and scale horizontally?
Recently, we have added a new (experimental) optaplanner-operator module that will simplify running OptaPlanner workloads on Kubernetes.
In this article I am going to show how to use the OptaPlanner Operator to deploy and scale school timetabling on OpenShift.
The source code is available in the Kubernetes demo, along with the other OptaPlanner Quickstarts.
Architecture
The demo consists of two projects: the School Timetabling, which defines the optimization problem, and the Demo App, which generates datasets and asks the School Timetabling to solve them. Both projects are Quarkus applications.
These two parts communicate via two Kafka topics created by the OptaPlanner Operator: school-timetabling-problem and school-timetabling-solution. The Demo App stores a dataset in the PostgreSQL database and sends a message to the school-timetabling-problem topic. The School Timetabling reads the message, loads the dataset from the database and solves it. After that, it stores the solution back in the database and sends a message to the school-timetabling-solution topic to let the Demo App know the solution is ready to be picked up.
At first glance, PostgreSQL makes the architecture more complex, since the Demo App could have sent the dataset directly in a Kafka message. However, Kafka is designed to process huge numbers of small messages, which is not exactly our case. The datasets, although they do not come in millions, can be large, so the Kafka messages need to be paired with some sort of storage.
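The flow described above can be sketched in plain Java with in-memory stand-ins. This is only an illustration, not the demo's actual code: the class and method names are hypothetical, a map plays the role of PostgreSQL, two queues play the role of the Kafka topics, and it assumes that each message carries nothing but the dataset ID.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Illustrative only: in-memory stand-ins for PostgreSQL and the two Kafka topics.
class ClaimCheckSketch {
    // Stand-in for the PostgreSQL database: dataset ID -> dataset (or solution) payload.
    static final Map<Long, String> database = new HashMap<>();
    // Stand-ins for the school-timetabling-problem and school-timetabling-solution topics.
    // Assumption: a message contains only the dataset ID, never the dataset itself.
    static final Queue<Long> problemTopic = new ArrayDeque<>();
    static final Queue<Long> solutionTopic = new ArrayDeque<>();

    // Demo App side: store the dataset first, then announce it by ID.
    static void submit(long datasetId, String dataset) {
        database.put(datasetId, dataset);
        problemTopic.add(datasetId);
    }

    // School Timetabling side: read one message, load, "solve", store, notify.
    static void solveNext() {
        Long datasetId = problemTopic.poll();
        if (datasetId == null) {
            return; // no pending problem message
        }
        String dataset = database.get(datasetId);
        String solution = "solved:" + dataset; // placeholder for the real solving step
        database.put(datasetId, solution);
        solutionTopic.add(datasetId);
    }

    public static void main(String[] args) {
        submit(1L, "large-timetable");
        solveNext();
        Long readyId = solutionTopic.poll();
        System.out.println(readyId + " -> " + database.get(readyId));
    }
}
```

The point of the design is that the messages stay small no matter how large a dataset grows, because the payload only ever travels through the database.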
Solver Operator
The OptaPlanner Operator is a Quarkus application developed on top of the Java Operator SDK. Its job is to ensure all the Kubernetes resources needed by the solver are in place: it creates Kafka topics and a deployment that runs the solver project; in this case, the School Timetabling.
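The idea of "ensuring all resources are in place" can be illustrated with a small desired-state comparison. The following is a plain-Java sketch, not the operator's code and not the Java Operator SDK API: the class, the method names and the resource naming scheme (topics and deployment derived from the Solver name) are assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: computes which resources would still have to be created.
class ReconcileSketch {
    // Resources the solver is assumed to need, derived from the Solver name (hypothetical naming).
    static Set<String> desired(String solverName) {
        Set<String> resources = new LinkedHashSet<>();
        resources.add("kafkatopic/" + solverName + "-problem");
        resources.add("kafkatopic/" + solverName + "-solution");
        resources.add("deployment/" + solverName);
        return resources;
    }

    // Desired resources minus the ones that already exist.
    static Set<String> missing(String solverName, Set<String> existing) {
        Set<String> result = desired(solverName);
        result.removeAll(existing);
        return result;
    }

    public static void main(String[] args) {
        Set<String> existing = Set.of("kafkatopic/school-timetabling-problem");
        // Prints the resources a reconciliation pass would create next.
        System.out.println(missing("school-timetabling", existing));
    }
}
```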
Running the demo
To run the demo, just clone the OptaPlanner Quickstarts and follow the README.
Scaling the School Timetabling
Once you’ve got the demo running and the School Timetabling pod solves datasets you throw at it, it’s time to take it a bit further. Remember, the main reason for deploying all the pieces to OpenShift was to be able to scale horizontally.
To solve multiple datasets in parallel, we have to start more School Timetabling pods and increase the number of school-timetabling-problem partitions.
The spec.scaling.replicas property in the Solver custom resource defines the number of pods and topic partitions.
In order to have multiple consumers reading different messages from the same Kafka topic without duplication, the consumers must belong to the same consumer group.
The property mp.messaging.incoming.solver_in.group.id=default in the school-timetabling/src/resources/application.properties ensures that each pod belongs to the default consumer group.
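Why pods and partitions have to grow together can be seen from a toy model of a consumer group, in which each partition is read by exactly one consumer of the group. The sketch below uses a simple round-robin split purely for illustration; it is not the assignment strategy Kafka necessarily applies, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: splits topic partitions among the consumers of one group.
class ConsumerGroupSketch {
    // Returns, for each consumer, the list of partitions it would read.
    // Expects consumers >= 1.
    static List<List<Integer>> assign(int partitions, int consumers) {
        List<List<Integer>> assignment = new ArrayList<>();
        for (int c = 0; c < consumers; c++) {
            assignment.add(new ArrayList<>());
        }
        // Each partition goes to exactly one consumer, so no message is read twice.
        for (int p = 0; p < partitions; p++) {
            assignment.get(p % consumers).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println(assign(1, 3)); // [[0], [], []]: two of three pods stay idle
        System.out.println(assign(3, 3)); // [[0], [1], [2]]: every pod has work
    }
}
```

With a single partition, adding pods alone does not help, because only one consumer in the group can read that partition.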
Let’s see how the custom resource changes if we want to have three School Timetabling pods:
apiVersion: org.optaplanner.solver/v1beta1
kind: Solver
metadata:
  name: school-timetabling
spec:
  ...
  scaling:
    replicas: 3
To update the Solver resource:
- delete the existing Solver resource via oc delete solver school-timetabling
- create the updated Solver resource via oc apply -f <file>
- check if the school-timetabling-problem Kafka topic now has 3 partitions via oc get kafkatopic school-timetabling-problem
- check if there are 3 running School Timetabling pods via oc get pod
In the Demo App, create and send multiple datasets.
Check the logs of individual School Timetabling pods by running oc logs <pod name> to find out whether they solved some datasets.
The following messages should appear in the logs for each solved dataset:
2022-05-27 11:12:21,336 INFO [org.opt.cor.imp.sol.DefaultSolver] (Thread-3) Solving started: time spent (76), best score (-80init/0hard/0soft), environment mode (REPRODUCIBLE), move thread count (NONE), random (JDK with seed 0).
...
2022-05-27 11:12:31,249 INFO [org.opt.cor.imp.sol.DefaultSolver] (Thread-3) Solving ended: time spent (10001), best score (0hard/18soft), score calculation speed (40162/sec), phase total (2), environment mode (REPRODUCIBLE), move thread count (NONE).
Conclusion
OptaPlanner is starting its journey towards Kubernetes and OpenShift. The nice thing about the outlined architecture is that if you have another planning problem, you just create a new Solver resource pointing to a different container image, and you get a separate deployment and a separate pair of the problem-solution topics.
There are still many things users have to do themselves, things I would like the OptaPlanner Operator to take care of in the future.
Stay tuned, this is just the beginning!