###################
Course XML Tutorial
###################

EdX uses an XML format to describe the structure and contents of its courses. While much of this is abstracted away by the Studio authoring interface, it is still helpful to understand how the edX platform renders a course.

This guide was written with the assumption that you've dived straight into the edX platform without necessarily having any prior programming/CS knowledge. It will be especially valuable to you if your course is being authored with XML files rather than Studio -- in which case you're likely using functionality that is not yet fully supported in Studio.

*****
Goals
*****

After reading this, you should be able to:

* Organize your course content into the files and folders the edX platform expects.
* Add new content to a course and make sure it shows up in the courseware.

*Prerequisites:* it would be helpful to know a little bit about XML. If you've never seen it before, look at a simple XML example first.

************
Introduction
************

A course is organized hierarchically. We start by describing course-wide parameters, then break the course into chapters, and then go deeper and deeper until we reach a specific pset, video, etc. You could make an analogy to finding a green shirt in your house -> bedroom -> closet -> drawer -> shirts -> green shirt.

We'll begin with a sample course structure as a case study of how XML and files in a course are organized. More technical details will follow, including discussion of some special cases.

**********
Case Study
**********

Let's jump right in by looking at the directory structure of a very simple toy course::

    toy/
        course/
        course.xml
        problem/
        policies/
        roots/

The only top level file is `course.xml`, which should contain one line, looking something like this:

.. code-block:: xml

This gives all the information to uniquely identify a particular run of any course -- which organization is producing the course, what the course name is, and what "run" this is, specified via the `url_name` attribute.

Obviously, this doesn't actually specify any of the course content, so we need to find that next. To know where to look, you need to know the standard organizational structure of our system: course elements are uniquely identified by the combination `(category, url_name)`. In this case, we are looking for a `course` element with the `url_name` "2012_Fall". The definition of this element will be in `course/2012_Fall.xml`. Let's look there next::

    toy/
        course/
            2012_Fall.xml  # <-- Where we look for category="course", url_name="2012_Fall"

.. code-block:: xml

Aha. Now we've found some content. We can see that the course is organized hierarchically, in this case with only one chapter, with `url_name` "Overview". The chapter contains a `videosequence` and a `video`, with the sequence containing a problem and another video. When viewed in the courseware, chapters are shown at the top level of the navigation accordion on the left, with any elements directly included in the chapter below.

Looking at this file, we can see the course structure and the YouTube URLs for the videos, but what about the "warmup" problem? There is no problem content here! Where should we look? This is a good time to pause and try to answer that question based on our organizational structure above.

As you hopefully guessed, the problem would be in `problem/warmup.xml`. (Note: This tutorial doesn't discuss the XML format for problems -- there are chapters of edx4edx that describe it.) This is an instance of a *pointer tag*: any XML tag with only the category and a `url_name` attribute will point to the file `{category}/{url_name}.xml`. For example, this means that our toy `course.xml` could have also been written as

.. code-block:: xml

with `chapter/Overview.xml` containing

.. code-block:: xml

In fact, this is the recommended structure for real courses -- putting each chapter into its own file makes it easy to have different people work on each without conflicting or having to merge. Similarly, as sequences get large, it can be handy to split them out as well (in `sequence/{url_name}.xml`, of course).

Note that the `url_name` is only specified once per element -- either in the inline definition, or in the pointer tag.

Policy Files
============

We still haven't looked at two of the directories in the top-level listing above: `policies` and `roots`. Let's look at policies next. The policy directory contains one file::

    toy/
        policies/
            2012_Fall.json

and that file is named `{course-url_name}.json`. As you might expect, this file contains a policy for the course. In our example, it looks like this:

.. code-block:: json

    {
        "course/2012_Fall": {
            "graceperiod": "2 days 5 hours 59 minutes 59 seconds",
            "start": "2015-07-17T12:00",
            "display_name": "Toy Course"
        },
        "chapter/Overview": {
            "display_name": "Overview"
        },
        "videosequence/Toy_Videos": {
            "display_name": "Toy Videos",
            "format": "Lecture Sequence"
        },
        "problem/warmup": {
            "display_name": "Getting ready for the semester"
        },
        "video/Video_Resources": {
            "display_name": "Video Resources"
        },
        "video/Welcome": {
            "display_name": "Welcome"
        }
    }

The policy specifies metadata about the content elements -- things which are not inherent to the definition of the content, but which describe how the content is presented to the user and used in the course. See below for a full list of metadata attributes. As the example shows, they include `display_name`, which is what is shown when this piece of content is referenced or shown in the courseware; various dates and times, like `start`, which specifies when the content becomes visible to students; and various problem-specific parameters, like the allowed number of attempts.
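
The policy keys above follow the same `{category}/{url_name}` pattern as pointer tags, so both lookups can be sketched with the same pair of values. The following is a minimal illustration only, not the platform's loading code; the function names `policy_key`, `pointer_path`, and `lookup_metadata` are hypothetical, and the policy text is an abridged copy of the toy example.

.. code-block:: python

    import json

    # Abridged copy of the toy course policy shown above.
    POLICY_JSON = """
    {
        "course/2012_Fall": {
            "graceperiod": "2 days 5 hours 59 minutes 59 seconds",
            "start": "2015-07-17T12:00",
            "display_name": "Toy Course"
        },
        "problem/warmup": {
            "display_name": "Getting ready for the semester"
        }
    }
    """


    def policy_key(category, url_name):
        """Build the policy key for an element, e.g. 'problem/warmup'."""
        return f"{category}/{url_name}"


    def pointer_path(category, url_name):
        """Return the file a pointer tag refers to: {category}/{url_name}.xml."""
        return f"{category}/{url_name}.xml"


    def lookup_metadata(policy, category, url_name):
        """Return the metadata dict for one element, or {} if none is listed."""
        return policy.get(policy_key(category, url_name), {})


    policy = json.loads(POLICY_JSON)
    warmup = lookup_metadata(policy, "problem", "warmup")
    print(pointer_path("problem", "warmup"))  # problem/warmup.xml
    print(warmup["display_name"])             # Getting ready for the semester

Returning an empty dict for unlisted elements is a choice made for this sketch; it keeps callers simple when an element has no policy entry of its own.
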
One important point is that some metadata is inherited: for example, specifying the start date on the course makes it the default for every element in the course. See below for more details.

It is possible to put metadata directly in the XML, as attributes of the appropriate tag, but using a policy file has two benefits: it puts all the policy in one place, making it easier to check that things like due dates are set properly, and it allows the content definitions to be easily used in another run of the same course, with the same or similar content, but different policy.

Roots
=====

The last directory in the top level listing is `roots`. In our toy course, it contains a single file::

    roots/
        2012_Fall.xml

This file is identical to the top-level `course.xml`, containing

.. code-block:: xml

In fact, the top level `course.xml` is a symbolic link to this file. When there is only one run of a course, the roots directory is not really necessary, and the top-level `course.xml` file can just specify the `url_name` of the course. However, if we wanted to make a second run of our toy course, we could add another file called, e.g., `roots/2013_Spring.xml`, containing

.. code-block:: xml

After creating `course/2013_Spring.xml` with the course structure (possibly as a symbolic link to or copy of `course/2012_Fall.xml` if no content was changing), and `policies/2013_Spring.json`, we would have two different runs of the toy course in the course repository. Our build system understands this roots structure, and will build a course package for each root.

.. note::
    If you're using a local development environment, make the top level `course.xml` point to the desired root, and check out the repo multiple times if you need multiple runs simultaneously.

That's basically all there is to the organizational structure.
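
The per-run layout described above (a root, a course file, and a policy file for each run) can be checked with a short script. This is only a sketch under the naming shown in this tutorial; `expected_files` and `missing_files` are hypothetical helper names, and the temporary directory merely stands in for a course repository.

.. code-block:: python

    from pathlib import Path
    import tempfile


    def expected_files(run):
        """Files the tutorial describes for one run (url_name) of a course."""
        return [
            Path("roots") / f"{run}.xml",
            Path("course") / f"{run}.xml",
            Path("policies") / f"{run}.json",
        ]


    def missing_files(course_dir, run):
        """Return the expected files for `run` that are absent under course_dir."""
        root = Path(course_dir)
        return [p for p in expected_files(run) if not (root / p).exists()]


    # Demonstration in a throwaway directory standing in for the course repo.
    with tempfile.TemporaryDirectory() as tmp:
        for rel in expected_files("2012_Fall"):
            target = Path(tmp) / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.touch()
        complete = missing_files(tmp, "2012_Fall")      # every file exists
        incomplete = missing_files(tmp, "2013_Spring")  # nothing created yet

    print(complete)
    print([p.as_posix() for p in incomplete])

Running a check like this before adding a second run makes it easier to spot a root that has no matching course or policy file.
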
Read the next section for details on the tags we support, including some special case tags like `customtag` and `html` invariants, and look at the end for some tips that will make the editing process easier.

****
Tags
****

.. list-table::
   :widths: 10 80
   :header-rows: 0

   * - `abtest`
     - Support for A/B testing. TODO: add details.
   * - `chapter`
     - Top level organization unit of a course. The courseware display code currently expects the top level `course` element to contain only chapters, though there is no philosophical reason why this is required, so we may change it to properly display non-chapters at the top level.
   * - `conditional`
     - Conditional element, which shows one or more modules only if certain conditions are satisfied.
   * - `course`
     - Top level tag. Contains everything else.
   * - `customtag`
     - Render an HTML template, filling in some parameters, and return the resulting HTML. See below for details.
   * - `discussion`
     - Inline discussion forum.
   * - `html`
     - A reference to an HTML file.
   * - `error`
     - Don't put these in by hand :) The internal representation of content that has an error, such as malformed XML or some broken invariant.
   * - `problem`
     - See elsewhere in edx4edx for documentation on the format.
   * - `problemset`
     - Logically, a series of related problems. Currently displayed vertically. May contain explanatory HTML, videos, etc.
   * - `sequential`
     - A sequence of content, currently displayed with a horizontal list of tabs. If possible, use a more semantically meaningful tag (currently, we only have `videosequence`).
   * - `vertical`
     - A sequence of content, displayed vertically. Content will be accessed all at once, on the right part of the page. No navigational bar. May have to use browser scroll bars. Content split with separators. If possible, use a more semantically meaningful tag (currently, we only have `problemset`).
   * - `video`
     - A link to a video, currently expected to be hosted on YouTube.
   * - `videosequence`
     - A sequence of videos. This can contain various non-video content; it just signals to the system that this is logically part of an explanatory sequence of content, as opposed to, say, an exam sequence.

Container Tags
==============

Container tags include `chapter`, `sequential`, `videosequence`, `vertical`, and `problemset`. They are all specified in the same way in the XML, as shown in the tutorial above.

`course`
========

`course` is also a container, and is similar, with one extra wrinkle: the top level pointer tag *must* have `org` and `course` attributes specified -- the organization name and the course name. Note that `course` refers to the platonic ideal of this course (e.g. "6.002x"), not to any particular run of this course. The `url_name` should be the particular run of this course.

`conditional`
=============

`conditional` is a special kind of container tag as well. Here are two examples:

.. code-block:: xml

The condition can be either `require_completed`, in which case the required modules must be completed, or `require_attempted`, in which case the required modules must have been attempted. The required modules are specified as a set of `tag`/`url_name` pairs, joined by an ampersand.

`customtag`
===========

When we see:

.. code-block:: xml

We will:

#. Look for a file called `custom_tags/special` in your course dir.
#. Render it as a mako template, passing parameters `{'animal':'unicorn', 'hat':'blue'}`, generating HTML. (Google `mako` for template syntax, or look at existing examples.)

Since `customtag` is already a pointer, there is generally no need to put it into a separate file -- just use it in place:

.. code-block:: xml

`discussion`
============

The discussion tag embeds an inline discussion module. The XML format is:

.. code-block:: xml

The meaning of each attribute is as follows:

.. list-table::
   :widths: 10 80
   :header-rows: 0

   * - `for`
     - A string that describes the discussion. Purely for descriptive purposes (to the student).
   * - `id`
     - The identifier that the discussion forum service uses to refer to this inline discussion module. Since the `id` must be unique and lives in a common namespace with all other courses, the preferred convention is an underscore-separated identifier that names the course, the run, and the discussion topic, as in the above example. The `id` should be "machine-friendly", e.g. use only alphanumeric characters and underscores. Do **not** use a period (e.g. `6.002x_Fall_2012_Overview`).
   * - `discussion_category`
     - The inline module will be indexed in the main "Discussion" tab of the course. The inline discussions are organized into a directory-like hierarchy. Note that the forward slash indicates depth, as in conventional filesystems. In the above example, this discussion module will show up in the following "directory": `Week 1/Overview/Course overview`

Note that the value of the `for` attribute has been appended to the end of the `discussion_category`. This can often lead to deeply nested subforums, which may not be intended. In the above example, if we were to use instead:

.. code-block:: xml

This discussion module would show up in the main forums as `Week 1 / Course overview`, which is more succinct.

`html`
======

Most of our content is in XML, but some HTML content may not be proper XML (all tags matched, a single top-level tag, etc.), since browsers are fairly lenient in what they'll display. So, there are two ways to include HTML content:

* If your HTML content is in a proper XML format, just put it in `html/{url_name}.xml`.
* If your HTML content is not in proper XML format, you can put it in `html/{filename}.html`, and put an `html` tag that refers to that file in `html/{filename}.xml`. This allows another level of indirection, and makes sure that we can read the XML file and then just return the actual HTML content without trying to parse it.

`video`
=======

Videos have an attribute `youtube`, which specifies a series of speeds + YouTube video IDs:

.. code-block:: xml
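
A small Python sketch of reading such an attribute value is given here for illustration. It assumes, since the exact syntax is not spelled out in this section, that the value is a comma-separated list of `speed:video_id` pairs; the function name `parse_youtube` and the IDs are placeholders, not part of the platform.

.. code-block:: python

    def parse_youtube(value):
        """Parse 'speed:id,speed:id' into {speed: video_id} (assumed format)."""
        videos = {}
        for pair in value.split(","):
            pair = pair.strip()
            if not pair:
                continue  # tolerate a trailing comma
            speed, sep, video_id = pair.partition(":")
            if not sep or not video_id:
                raise ValueError(f"expected 'speed:id', got {pair!r}")
            videos[float(speed)] = video_id.strip()
        return videos


    # Placeholder IDs, not real videos.
    speeds = parse_youtube("0.75:ID_SLOW, 1.0:ID_NORMAL, 1.5:ID_FAST")
    print(speeds[1.0])  # ID_NORMAL

Keying the result by playback speed makes it straightforward to check that a normal-speed (1.0) entry is present before publishing a video.
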