Modern data parallel frameworks, such as Apache Spark, are designed to execute complex data processing jobs that contain a large number of tasks, with dependencies between these tasks represented by a directed acyclic graph (DAG). When scheduling these tasks, the ultimate objective is to minimize the makespan of the schedule, which is equivalent to minimizing the job completion time. With task dependencies, however, minimizing the makespan of the schedule is non-trivial, especially when tasks in the DAG have different resource demands with respect to multiple resource types. In this paper, we present Spear, a new scheduling framework designed to minimize the makespan of complex jobs, while considering both task dependencies and heterogeneous resource demands at the same time. Inspired by recent advances in artificial intelligence, Spear applies Monte Carlo Tree Search (MCTS) in the specific context of task scheduling, and trains a deep reinforcement learning model to guide the expansion and rollout steps in MCTS. With deep reinforcement learning, search efficiency can be significantly improved by focusing on more promising branches. With both simulations and experiments using traces from production workloads, we compare the scheduling performance of Spear with state-of-the-art job schedulers in the literature, and Spear can outperform those approaches by up to 20%. Our results have validated our claims that MCTS and deep reinforcement learning can readily be applied to optimize the scheduling of complex jobs with task dependencies.